Abstract

A number of algorithms for computing the simulation preorder are available. Let $\Sigma$ denote the state
space, $\rightarrow$ the transition relation and $P_{\mathrm{sim}}$ the partition of $\Sigma$ induced by simulation equivalence. The
algorithms by Henzinger, Henzinger, Kopke and by Bloom and Paige run in $O(|\Sigma||{\rightarrow}|)$-time and, as
far as time complexity is concerned, they are the best available algorithms. However, these algorithms
have the drawback of a space complexity that is more than quadratic in the size of the state space. The
algorithm by Gentilini, Piazza, Policriti (subsequently corrected by van Glabbeek and Ploeger)
appears to provide the best compromise between time and space complexity. Gentilini et al.'s algorithm
runs in $O(|P_{\mathrm{sim}}|^2|{\rightarrow}|)$-time while the space complexity is in $O(|P_{\mathrm{sim}}|^2 + |\Sigma| \log |P_{\mathrm{sim}}|)$. We present
here a new efficient simulation algorithm that is obtained as a modification of Henzinger et al.'s algorithm
and whose correctness is based on some techniques used in applications of abstract interpretation to
model checking. Our algorithm runs in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$-time and $O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$-space. Thus, this
algorithm improves the best known time bound while retaining an acceptable space complexity that is
in general less than quadratic in the size of the state space. An experimental evaluation showed good
comparative results with respect to Henzinger, Henzinger and Kopke's algorithm.

1 Introduction 

Abstraction techniques are widely used in model checking to hide some properties of the concrete model in
order to define a reduced abstract model on which to run the verification algorithm [1, 9]. Abstraction provides
an effective solution to deal with the state-explosion problem that arises in model checking systems with
parallel components [7]. The reduced abstract structure is required at least to weakly preserve a specification
language $\mathcal{L}$ of interest: if a formula $\varphi \in \mathcal{L}$ is satisfied by the reduced abstract model then $\varphi$ must hold
on the original unabstracted model as well. Ideally, the reduced model should be strongly preserving w.r.t.
$\mathcal{L}$: $\varphi \in \mathcal{L}$ holds on the concrete model if and only if $\varphi$ is satisfied by the reduced abstract model. One
common approach for abstracting a model consists in defining a logical equivalence or preorder on system
states that weakly/strongly preserves a given temporal language. Moreover, this equivalence or preorder
often arises as a behavioural relation in the context of process calculi [10]. Two well-known examples are
bisimulation equivalence, which strongly preserves expressive logics such as CTL* and the full $\mu$-calculus [5],
and the simulation preorder, which ensures weak preservation of universal and existential fragments of the
$\mu$-calculus like ACTL* and ECTL* as well as of linear-time languages like LTL [22, 25]. Simulation
equivalence, namely the equivalence relation obtained as symmetric reduction of the simulation preorder,
is particularly interesting because it can provide a significantly better state space reduction than
bisimulation equivalence while retaining the ability of strongly preserving expressive temporal languages like
ACTL*.



State of the Art. It is known that computing simulation is harder than computing bisimulation [24]. Let
$\mathcal{K} = (\Sigma, \rightarrow, \ell)$ denote a Kripke structure on the state space $\Sigma$, with transition relation $\rightarrow$ and labeling
function $\ell : \Sigma \rightarrow \wp(AP)$, for a given set $AP$ of atomic propositions. Bisimulation equivalence can be
computed by the well-known algorithm of Paige and Tarjan [26], which runs in $O(|{\rightarrow}| \log |\Sigma|)$-time. A number
of algorithms for computing simulation equivalence exist; the best known are by Henzinger, Henzinger
and Kopke [23], Bloom and Paige [2], Bustan and Grumberg [6], Tan and Cleaveland [29] and
Gentilini, Piazza and Policriti [18], the latter subsequently corrected by van Glabbeek and Ploeger [21].
The algorithms by Henzinger, Henzinger, Kopke and by Bloom and Paige run in $O(|\Sigma||{\rightarrow}|)$-time and, as
far as time complexity is concerned, they are the best available algorithms. However, both these
algorithms have the drawback of a space complexity that is bounded from below by $\Omega(|\Sigma|^2)$. This is due
to the fact that the simulation preorder is computed in an explicit way, i.e., for any state $s \in \Sigma$, the
set of states that simulate $s$ is explicitly given as output. This quadratic lower bound in the size of the
state space is clearly a critical issue in model checking. There is therefore a strong motivation for
designing simulation algorithms that are less demanding on space requirements. Bustan and Grumberg [6]
provide a first solution in this direction. Let $P_{\mathrm{sim}}$ denote the partition corresponding to simulation
equivalence on $\mathcal{K}$, so that $|P_{\mathrm{sim}}|$ is the number of simulation equivalence classes. Then, Bustan and
Grumberg's algorithm has a space complexity in $O(|P_{\mathrm{sim}}|^2 + |\Sigma| \log |P_{\mathrm{sim}}|)$, although the time complexity
in $O(|P_{\mathrm{sim}}|^4(|{\rightarrow}| + |P_{\mathrm{sim}}|^2) + |P_{\mathrm{sim}}|^2|\Sigma|(|\Sigma| + |P_{\mathrm{sim}}|^2))$ remains a serious drawback. The simulation
algorithm by Tan and Cleaveland [29] simultaneously computes also the state partition $P_{\mathrm{bis}}$
corresponding to bisimulation equivalence. Under the simplifying assumption of dealing with a total
transition relation, this procedure has a time complexity in $O(|{\rightarrow}|(|P_{\mathrm{bis}}| + \log |\Sigma|))$ and a space complexity in
$O(|{\rightarrow}| + |P_{\mathrm{bis}}|^2 + |\Sigma| \log |P_{\mathrm{bis}}|)$ (the latter term $|\Sigma| \log |P_{\mathrm{bis}}|$ does not appear in [29] and takes into account
the relation that maps each state into its bisimulation equivalence class). The algorithm by Gentilini, Piazza
and Policriti [18] appears to provide the best compromise between time and space complexity. Gentilini
et al.'s algorithm runs in $O(|P_{\mathrm{sim}}|^2|{\rightarrow}|)$-time, namely it remarkably improves on Bustan and Grumberg's
algorithm and is not directly comparable with Tan and Cleaveland's algorithm, while the space complexity
$O(|P_{\mathrm{sim}}|^2 + |\Sigma| \log |P_{\mathrm{sim}}|)$ is the same as that of Bustan and Grumberg's algorithm and improves on Tan and
Cleaveland's algorithm. Moreover, Gentilini et al. show experimentally that in most cases their procedure
improves on Tan and Cleaveland's algorithm both in time and space.

Main Contributions. This work presents a new efficient simulation algorithm, called SA, that runs in
$O(|P_{\mathrm{sim}}||{\rightarrow}|)$-time and $O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$-space. Thus, while retaining an acceptable space complexity
that is in general less than quadratic in the size of the state space, our algorithm improves the best known
time bound.

Let us recall that a relation $R$ between states is a simulation if for any $s, s' \in \Sigma$ such that $(s, s') \in R$,
$\ell(s) = \ell(s')$ and for any $t \in \Sigma$ such that $s \rightarrow t$, there exists $t' \in \Sigma$ such that $s' \rightarrow t'$ and $(t, t') \in R$. Then,
$s'$ simulates $s$, namely the pair $(s, s')$ belongs to the simulation preorder $R_{\mathrm{sim}}$, if there exists a simulation
relation $R$ such that $(s, s') \in R$. Also, $s$ and $s'$ are simulation equivalent, namely they belong to the same block
of the simulation partition $P_{\mathrm{sim}}$, if $s'$ simulates $s$ and vice versa.
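The definition above can be turned directly into a naive fixpoint computation. The following Python sketch is only an illustration of the definition, not the HHK or SA algorithm: it starts from all pairs of equally labeled states and repeatedly deletes pairs that violate the transfer condition. The function name `simulation_preorder` and the encoding of a Kripke structure as a set of states, a set of transition pairs and a label dictionary are assumptions made for this example.

```python
def simulation_preorder(states, trans, label):
    """Return the largest simulation as pairs (s, s2), read 's2 simulates s'.

    states: iterable of states; trans: set of (source, target) pairs;
    label: dict mapping each state to its label (hypothetical encoding).
    """
    succ = {s: {t for (a, t) in trans if a == s} for s in states}
    # Start from every pair of states carrying the same label.
    rel = {(s, s2) for s in states for s2 in states if label[s] == label[s2]}
    changed = True
    while changed:
        changed = False
        for (s, s2) in list(rel):
            # Every move s -> t must be matched by some s2 -> t2 with (t, t2) in rel.
            ok = all(any((t, t2) in rel for t2 in succ[s2]) for t in succ[s])
            if not ok:
                rel.discard((s, s2))
                changed = True
    return rel


# Toy structure (hypothetical): 0 and 1 are labeled p, 2 is labeled q.
toy_states = {0, 1, 2}
toy_trans = {(0, 2), (1, 2), (1, 1)}
toy_label = {0: "p", 1: "p", 2: "q"}
toy_rsim = simulation_preorder(toy_states, toy_trans, toy_label)
```

In this toy structure state 1 simulates state 0, but not vice versa, because the self-loop on 1 cannot be matched from 0. The sketch stores the relation explicitly, which is exactly the quadratic space cost discussed above.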

Our simulation algorithm SA is designed as a modification of Henzinger, Henzinger and Kopke's [23]
algorithm, here denoted by HHK. The space complexity of HHK is in $O(|\Sigma|^2 \log |\Sigma|)$. This is a
consequence of the fact that HHK computes explicitly the simulation preorder, namely it maintains for any
state $s \in \Sigma$ a set of states $Sim(s) \subseteq \Sigma$, called the simulator set of $s$, which stores states that are currently
candidates for simulating $s$. Our algorithm SA computes instead a symbolic representation of the
simulation preorder, namely it maintains: (i) a partition $P$ of the state space $\Sigma$ that is always coarser than the
final simulation partition $P_{\mathrm{sim}}$ and (ii) a relation $Rel \subseteq P \times P$ on the current partition $P$ that encodes
the simulation relation between blocks of simulation equivalent states. This symbolic representation is
the key both for obtaining the $O(|P_{\mathrm{sim}}||{\rightarrow}|)$ time bound and for limiting the space complexity of SA to
$O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$, so that memory requirements may be lower than quadratic in the size of the state
space.

The basic idea of our approach is to investigate whether the logical structure of the HHK algorithm may
be preserved by replacing the family of sets of states $\mathcal{S} = \{Sim(s)\}_{s \in \Sigma}$ with the following state partition
$P$ induced by $\mathcal{S}$: two states $s_1$ and $s_2$ are equivalent in $P$ iff for all $s \in \Sigma$, $s_1 \in Sim(s) \Leftrightarrow s_2 \in Sim(s)$.
Additionally, we store and maintain a preorder relation $Rel \subseteq P \times P$ on the partition $P$ that gives rise to
a so-called partition-relation pair $\langle P, Rel \rangle$. The logical meaning of this data structure is that if $B, C \in P$
and $(B, C) \in Rel$ then any state in $C$ is currently a candidate to simulate each state in $B$, while two states
$s_1$ and $s_2$ in the same block $B$ are currently candidates to be simulation equivalent. Hence, a
partition-relation pair $\langle P, Rel \rangle$ represents the current approximation of the simulation preorder and in particular $P$
represents the current approximation of simulation equivalence. It turns out that the information encoded
by a partition-relation pair is enough for preserving the logical structure of HHK. In fact, analogously to
the stepwise design of the HHK procedure, this approach leads us to design a basic procedure BasicSA
based on partition-relation pairs which is then refined twice in order to obtain the final simulation algorithm
SA. The correctness of SA is proved w.r.t. the basic algorithm BasicSA and relies on abstract interpretation
techniques [12, 13]. More specifically, we exploit some previous results [27] that show how standard strong
preservation of temporal languages in abstract Kripke structures can be generalized by abstract
interpretation and cast as a so-called completeness property of abstract domains. On the other hand, the simulation
algorithm SA is designed as an efficient implementation of the basic procedure BasicSA where the
symbolic representation based on partition-relation pairs allows us to replace the size $|\Sigma|$ of the state space in
the time and space bounds of HHK with the size $|P_{\mathrm{sim}}|$ of the simulation partition in the corresponding
bounds for SA.
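To make the partition induced by a family of simulator sets concrete, here is a minimal Python sketch. It only illustrates the equivalence stated above (two states are grouped together iff they belong to exactly the same sets $Sim(s)$); the function name `partition_from_sim` and the dictionary encoding of $Sim$ are assumptions made for this example and are not taken from the HHK or SA implementations.

```python
def partition_from_sim(states, sim):
    """Group x and y together iff, for every s, x in Sim(s) <=> y in Sim(s).

    sim: dict mapping each state s to its simulator set Sim(s) (hypothetical encoding).
    """
    blocks = {}
    for x in states:
        # The signature of x is the set of states s whose simulator set contains x.
        signature = frozenset(s for s in states if x in sim[s])
        blocks.setdefault(signature, set()).add(x)
    return {frozenset(b) for b in blocks.values()}


# Hypothetical simulator sets on three states.
example_sim = {1: {1, 2}, 2: {1, 2}, 3: {3}}
example_partition = partition_from_sim({1, 2, 3}, example_sim)
```

Here states 1 and 2 occur in the same simulator sets and therefore end up in one block, so the partition needs two blocks where the explicit family stores three sets.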

Both HHK and SA have been implemented in C++. This practical evaluation considered benchmarks
from the VLTS (Very Large Transition Systems) suite [30] and some publicly available Esterel programs.
The experimental results showed that SA outperforms HHK.

2 Background 
2.1 Preliminaries 

Notations. Let $X$ and $Y$ be sets. If $S \subseteq X$ and $X$ is understood as a universe set then $\neg S = X \setminus S$.
If $f : X \rightarrow Y$ then the image of $f$ is denoted by $\mathrm{img}(f) = \{f(x) \in Y \mid x \in X\}$. When writing a set $S$
of subsets of a given set of integers, e.g. a partition, $S$ is often written in a compact form like $\{1, 12, 13\}$
or $\{[1], [12], [13]\}$ that stands for $\{\{1\}, \{1,2\}, \{1,3\}\}$. If $R \subseteq X \times X$ is any relation then $R^* \subseteq X \times X$
denotes the reflexive and transitive closure of $R$. Also, if $x \in X$ then $R(x) = \{x' \in X \mid (x, x') \in R\}$.

Orders. Let $\langle Q, \leq \rangle$ be a poset, that may also be denoted by $Q_{\leq}$. We use the symbol $\sqsubseteq$ to denote
pointwise ordering between functions: if $X$ is any set and $f, g : X \rightarrow Q$ then $f \sqsubseteq g$ if for all $x \in X$,
$f(x) \leq g(x)$. If $S \subseteq Q$ then $\max(S) = \{x \in S \mid \forall y \in S.\ x \leq y \Rightarrow x = y\}$ denotes the set of
maximal elements of $S$ in $Q$. A complete lattice $C_{\leq}$ is also denoted by $\langle C, \leq, \vee, \wedge, \top, \bot \rangle$ where $\vee$, $\wedge$,
$\top$ and $\bot$ denote, respectively, lub, glb, greatest element and least element in $C$. A function $f : C \rightarrow D$
between complete lattices is additive when $f$ preserves least upper bounds. Let us recall that a reflexive
and transitive relation $R \subseteq X \times X$ on a set $X$ is called a preorder on $X$.

Partitions. A partition $P$ of a set $\Sigma$ is a set of nonempty subsets of $\Sigma$, called blocks, that are pairwise
disjoint and whose union gives $\Sigma$. $\mathrm{Part}(\Sigma)$ denotes the set of partitions of $\Sigma$. If $P \in \mathrm{Part}(\Sigma)$ and $s \in \Sigma$
then $P(s)$ denotes the block of $P$ that contains $s$. $\mathrm{Part}(\Sigma)$ is endowed with the following standard partial
order $\preceq$: $P_1 \preceq P_2$, i.e. $P_2$ is coarser than $P_1$ (or $P_1$ refines $P_2$) iff $\forall B \in P_1.\ \exists B' \in P_2.\ B \subseteq B'$. If
$P_1, P_2 \in \mathrm{Part}(\Sigma)$, $P_1 \preceq P_2$ and $B \in P_1$ then $\mathrm{parent}_{P_2}(B)$ (when clear from the context the subscript
$P_2$ may be omitted) denotes the unique block in $P_2$ that contains $B$. For a given nonempty subset $S \subseteq \Sigma$
called splitter, we denote by $\mathrm{Split}(P, S)$ the partition obtained from $P$ by replacing each block $B \in P$
with the nonempty sets $B \cap S$ and $B \setminus S$, where we also allow no splitting, namely $\mathrm{Split}(P, S) = P$ (this
happens exactly when $S$ is a union of some blocks of $P$).
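The splitting operation just defined can be sketched in a few lines of Python. This is a minimal illustration under the assumption that a partition is encoded as a set of `frozenset` blocks; the function name `split` is chosen for this example.

```python
def split(partition, splitter):
    """Split(P, S): replace each block B by the nonempty sets among B & S and B - S."""
    result = set()
    for block in partition:
        inside, outside = block & splitter, block - splitter
        # Keep only nonempty pieces, so a block untouched by S is returned unchanged.
        result.update(part for part in (inside, outside) if part)
    return result


P = {frozenset({1, 2}), frozenset({3}), frozenset({4})}
refined = split(P, frozenset({2, 3}))        # block {1, 2} is split into {2} and {1}
unchanged = split(P, frozenset({1, 2, 4}))   # splitter is a union of blocks of P
```

The second call illustrates the no-splitting case: when the splitter is a union of blocks of $P$, the result is $P$ itself.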

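Kripke Structures. A transition system $(\Sigma, \rightarrow)$ consists of a set $\Sigma$ of states and a transition relation
$\rightarrow\ \subseteq \Sigma \times \Sigma$. The relation $\rightarrow$ is total when for any $s \in \Sigma$ there exists some $t \in \Sigma$ such that $s \rightarrow t$.
The predecessor/successor transformers $\mathrm{pre}_{\rightarrow}, \mathrm{post}_{\rightarrow} : \wp(\Sigma) \rightarrow \wp(\Sigma)$ (when clear from the context the
subscript $\rightarrow$ may be omitted) are defined as usual:

- $\mathrm{pre}_{\rightarrow}(Y) = \{a \in \Sigma \mid \exists b \in Y.\ a \rightarrow b\}$;

- $\mathrm{post}_{\rightarrow}(Y) = \{b \in \Sigma \mid \exists a \in Y.\ a \rightarrow b\}$.

Let us remark that $\mathrm{pre}_{\rightarrow}$ and $\mathrm{post}_{\rightarrow}$ are additive operators on the complete lattice $\wp(\Sigma)_{\subseteq}$. If $S_1, S_2 \subseteq \Sigma$
then $S_1 \rightarrow^{\exists\exists} S_2$ iff there exist $s_1 \in S_1$ and $s_2 \in S_2$ such that $s_1 \rightarrow s_2$.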
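As a small illustration of these transformers, the following Python sketch encodes a transition relation as a set of pairs. The names `pre`, `post` and `exists_exists` and the sample relation `T` are assumptions made for this example.

```python
def pre(trans, ys):
    """pre(Y): states with at least one successor in Y."""
    return {a for (a, b) in trans if b in ys}


def post(trans, ys):
    """post(Y): states with at least one predecessor in Y."""
    return {b for (a, b) in trans if a in ys}


def exists_exists(trans, s1, s2):
    """The lifted relation on sets: some state of S1 has a transition into S2."""
    return any(a in s1 and b in s2 for (a, b) in trans)


# Hypothetical transition relation on the states 1..4.
T = {(1, 2), (1, 3), (2, 3), (3, 4), (4, 4)}
```

Additivity can be observed directly on this relation: `pre(T, {3, 4})` coincides with `pre(T, {3}) | pre(T, {4})`.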
Given a set $AP$ of atomic propositions (of some specification language), a Kripke structure $\mathcal{K} =
(\Sigma, \rightarrow, \ell)$ over $AP$ consists of a transition system $(\Sigma, \rightarrow)$ together with a state labeling function $\ell : \Sigma \rightarrow
\wp(AP)$. A Kripke structure is called total when its transition relation is total. We use the following notation:
for any $s \in \Sigma$, $[s]_{\ell} \stackrel{\mathrm{def}}{=} \{s' \in \Sigma \mid \ell(s) = \ell(s')\}$ denotes the equivalence class of a state $s$ w.r.t. the labeling
$\ell$, while $P_{\ell} = \{[s]_{\ell} \mid s \in \Sigma\} \in \mathrm{Part}(\Sigma)$ is the partition induced by $\ell$.

2.2 Simulation Preorder and Equivalence 

Recall that a relation $R \subseteq \Sigma \times \Sigma$ is a simulation on a Kripke structure $\mathcal{K} = (\Sigma, \rightarrow, \ell)$ over a set $AP$ of
atomic propositions if for any $s, s' \in \Sigma$ such that $(s, s') \in R$:

(a) $\ell(s) = \ell(s')$;

(b) For any $t \in \Sigma$ such that $s \rightarrow t$, there exists $t' \in \Sigma$ such that $s' \rightarrow t'$ and $(t, t') \in R$.

If $(s, s') \in R$ then we say that $s'$ simulates $s$. The empty relation is a simulation and simulation relations
are closed under union, so that the largest simulation relation exists. It turns out that the largest simulation
is a preorder relation called simulation preorder (on $\mathcal{K}$) and denoted by $R_{\mathrm{sim}}$. Simulation equivalence
$\sim_{\mathrm{sim}}\ \subseteq \Sigma \times \Sigma$ is the symmetric reduction of $R_{\mathrm{sim}}$, namely $\sim_{\mathrm{sim}}\ = R_{\mathrm{sim}} \cap R_{\mathrm{sim}}^{-1}$. $P_{\mathrm{sim}} \in \mathrm{Part}(\Sigma)$ denotes
the partition corresponding to $\sim_{\mathrm{sim}}$ and is called simulation partition.

It is a well known result in model checking [14, 22, 25] that the reduction of $\mathcal{K}$ w.r.t. simulation
equivalence $\sim_{\mathrm{sim}}$ allows us to define an abstract Kripke structure $\mathcal{A}_{\mathrm{sim}} = \langle P_{\mathrm{sim}}, \rightarrow^{\exists\exists}, \ell^{\exists} \rangle$ that strongly preserves
the temporal language ACTL*, where: $P_{\mathrm{sim}}$ is the abstract state space, $\rightarrow^{\exists\exists}$ is the abstract transition relation
between simulation equivalence classes, while for any block $B \in P_{\mathrm{sim}}$, $\ell^{\exists}(B) = \ell(s)$ for any
representative $s \in B$. It turns out that $\mathcal{A}_{\mathrm{sim}}$ strongly preserves ACTL*, i.e., for any $\varphi \in$ ACTL*, $B \in P_{\mathrm{sim}}$ and
$s \in B$, we have that $s \models^{\mathcal{K}} \varphi$ if and only if $B \models^{\mathcal{A}_{\mathrm{sim}}} \varphi$.

2.3 Abstract Interpretation 

Abstract Domains as Closures. In standard abstract interpretation, abstract domains can be equivalently
specified either by Galois connections/insertions or by (upper) closure operators (uco's) [13]. These two
approaches are equivalent, modulo isomorphic representations of domain's objects. We follow here the
closure operator approach: this has the advantage of being independent from the representation of
domain's objects and is therefore appropriate for reasoning on abstract domains independently from their
representation.

Given a state space $\Sigma$, the complete lattice $\wp(\Sigma)_{\subseteq}$ plays the role of concrete domain. Let us recall that
an operator $\mu : \wp(\Sigma) \rightarrow \wp(\Sigma)$ is a uco on $\wp(\Sigma)$, that is an abstract domain of $\wp(\Sigma)$, when $\mu$ is monotone,
idempotent and extensive (viz., $X \subseteq \mu(X)$). It is well known that the set $\mathrm{uco}(\wp(\Sigma))$ of all uco's on $\wp(\Sigma)$,
endowed with the pointwise ordering $\sqsubseteq$, gives rise to the complete lattice $\langle \mathrm{uco}(\wp(\Sigma)), \sqsubseteq, \sqcup, \sqcap, \lambda X.\Sigma, \mathrm{id} \rangle$
of all the abstract domains of $\wp(\Sigma)$. The pointwise ordering $\sqsubseteq$ on $\mathrm{uco}(\wp(\Sigma))$ is the standard order for
comparing abstract domains with regard to their precision: $\mu_1 \sqsubseteq \mu_2$ means that the domain $\mu_1$ is a more
precise abstraction of $\wp(\Sigma)$ than $\mu_2$, or, equivalently, that the abstract domain $\mu_1$ is a refinement of $\mu_2$.

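A closure $\mu \in \mathrm{uco}(\wp(\Sigma))$ is uniquely determined by its image $\mathrm{img}(\mu)$, which coincides with its set of
fixpoints, as follows: $\mu = \lambda Y. \cap \{X \in \mathrm{img}(\mu) \mid Y \subseteq X\}$. Also, a set of subsets $\mathcal{X} \subseteq \wp(\Sigma)$ is the image of
some closure operator $\mu_{\mathcal{X}} \in \mathrm{uco}(\wp(\Sigma))$ iff $\mathcal{X}$ is a Moore-family of $\wp(\Sigma)$, i.e., $\mathcal{X} = \mathrm{Cl}^{\cap}(\mathcal{X}) \stackrel{\mathrm{def}}{=} \{\cap S \mid S \subseteq
\mathcal{X}\}$ (where $\cap \varnothing = \Sigma \in \mathrm{Cl}^{\cap}(\mathcal{X})$). In other terms, $\mathcal{X}$ is a Moore-family (or Moore-closed) when $\mathcal{X}$ is closed
under arbitrary intersections. In this case, $\mu_{\mathcal{X}} = \lambda Y. \cap \{X \in \mathcal{X} \mid Y \subseteq X\}$ is the corresponding closure
operator. For any $\mathcal{X} \subseteq \wp(\Sigma)$, $\mathrm{Cl}^{\cap}(\mathcal{X})$ is called the Moore-closure of $\mathcal{X}$, i.e., $\mathrm{Cl}^{\cap}(\mathcal{X})$ is the least set of
subsets of $\Sigma$ which contains all the subsets in $\mathcal{X}$ and is Moore-closed. Moreover, it turns out that for
any $\mu \in \mathrm{uco}(\wp(\Sigma))$ and any Moore-family $\mathcal{X} \subseteq \wp(\Sigma)$, $\mu_{\mathrm{img}(\mu)} = \mu$ and $\mathrm{img}(\mu_{\mathcal{X}}) = \mathcal{X}$. Thus, closure
operators on $\wp(\Sigma)$ are in bijection with Moore-families of $\wp(\Sigma)$. This allows us to consider a closure
operator $\mu \in \mathrm{uco}(\wp(\Sigma))$ both as a function $\mu : \wp(\Sigma) \rightarrow \wp(\Sigma)$ and as a Moore-family $\mathrm{img}(\mu) \subseteq \wp(\Sigma)$.
This is particularly useful and does not give rise to ambiguity since one can distinguish the use of a closure
$\mu$ as function or set according to the context.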
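The bijection between closure operators and Moore-families is easy to experiment with on a finite state space. The Python sketch below is an illustration only: the names `moore_closure` and `closure_operator` and the sample family are assumptions made for this example, and closure under pairwise intersections is used because it suffices when the universe is finite.

```python
from itertools import combinations


def moore_closure(universe, family):
    """Close a family of subsets under intersections; the empty intersection adds the universe."""
    closed = {frozenset(universe)} | {frozenset(x) for x in family}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            if a & b not in closed:
                closed.add(a & b)
                changed = True
    return closed


def closure_operator(moore_family):
    """Return mu with mu(Y) = intersection of all members of the family that contain Y."""
    def mu(y):
        y = frozenset(y)
        return frozenset.intersection(*[x for x in moore_family if y <= x])
    return mu


family = moore_closure({1, 2, 3, 4}, [{1, 2, 3}, {3, 4}, {4}])
mu = closure_operator(family)
```

On this sample family the closure adds the sets $\{3\}$ and $\varnothing$ (as intersections) and the whole universe, and `mu` behaves as an extensive and idempotent operator, e.g. `mu({1})` is `{1, 2, 3}`.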
Abstract Domains and Partitions. As shown in [27], it turns out that partitions can be viewed as
particular abstract domains. Let us recall here that any abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma))$ induces a partition
$\mathrm{par}(\mu) \in \mathrm{Part}(\Sigma)$ that corresponds to the following equivalence relation $\equiv_{\mu}$ on $\Sigma$:

$x \equiv_{\mu} y$ iff $\mu(\{x\}) = \mu(\{y\})$.

Example 2.1. Let $\Sigma = \{1, 2, 3, 4\}$ and consider the following abstract domains in $\mathrm{uco}(\wp(\Sigma))$ that are
given as intersection-closed subsets of $\wp(\Sigma)$: $\mu = \{\varnothing, 3, 4, 12, 34, 1234\}$, $\mu' = \{\varnothing, 3, 4, 12, 1234\}$, $\mu'' =
\{12, 123, 124, 1234\}$. These abstract domains all induce the same partition $P = \{[12], [3], [4]\} \in \mathrm{Part}(\Sigma)$.
For example, $\mu''(\{1\}) = \mu''(\{2\}) = \{1, 2\}$, $\mu''(\{3\}) = \{1, 2, 3\}$, $\mu''(\{4\}) = \{1, 2, 4\}$ so that $\mathrm{par}(\mu'') =
P$. □
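Example 2.1 can be checked mechanically. The following Python sketch is a minimal illustration in which an abstract domain is encoded by its image; the helper names `close`, `par` and `sets` are assumptions made for this example.

```python
def close(image, y):
    """mu(y): the least member of the intersection-closed image that contains y."""
    y = frozenset(y)
    return frozenset.intersection(*[x for x in image if y <= x])


def par(universe, image):
    """Partition induced by mu: x and y share a block iff mu({x}) == mu({y})."""
    blocks = {}
    for x in universe:
        blocks.setdefault(close(image, {x}), set()).add(x)
    return {frozenset(b) for b in blocks.values()}


def sets(*names):
    """Compact notation of the paper: '12' stands for {1, 2}, '' for the empty set."""
    return {frozenset(int(c) for c in n) for n in names}


SIGMA = {1, 2, 3, 4}
mu_a = sets("", "3", "4", "12", "34", "1234")   # mu   of Example 2.1
mu_b = sets("", "3", "4", "12", "1234")         # mu'  of Example 2.1
mu_c = sets("12", "123", "124", "1234")         # mu'' of Example 2.1
expected = sets("12", "3", "4")
```

All three images yield the partition `expected`, which matches the statement of the example.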



Forward Completeness. Let us consider an abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma)_{\subseteq})$, a concrete semantic
function $f : \wp(\Sigma) \rightarrow \wp(\Sigma)$ and a corresponding abstract semantic function $f^{\sharp} : \mu \rightarrow \mu$ (for
simplicity of notation, we consider 1-ary functions). It is well known that the abstract interpretation $\langle \mu, f^{\sharp} \rangle$ is
sound when $f \circ \mu \sqsubseteq f^{\sharp} \circ \mu$ holds: this means that a concrete computation $f(\mu(X))$ on an abstract object
$\mu(X)$ is correctly approximated in $\mu$ by $f^{\sharp}(\mu(X))$, that is, $f(\mu(X)) \subseteq f^{\sharp}(\mu(X))$. Forward
completeness corresponds to requiring the following strengthening of soundness: $\langle \mu, f^{\sharp} \rangle$ is forward complete when
$f \circ \mu = f^{\sharp} \circ \mu$. The intuition here is that the abstract function $f^{\sharp}$ is able to mimic $f$ on the abstract domain
$\mu$ with no loss of precision. This is called forward completeness because a dual and more standard notion
of backward completeness may also be considered (see e.g. [19]).

Example 2.2. As a toy example, let us consider the following abstract domain $Sign$ for representing the
sign of an integer variable: $Sign = \{\varnothing, \mathbb{Z}_{<0}, 0, \mathbb{Z}_{>0}, \mathbb{Z}\} \in \mathrm{uco}(\wp(\mathbb{Z})_{\subseteq})$. The concrete pointwise addition
$+ : \wp(\mathbb{Z}) \times \wp(\mathbb{Z}) \rightarrow \wp(\mathbb{Z})$ on sets of integers, that is $X + Y \stackrel{\mathrm{def}}{=} \{x + y \mid x \in X,\ y \in Y\}$, is approximated
in $Sign$ by the abstract addition $+^{Sign} : Sign \times Sign \rightarrow Sign$ that is defined as expected by the following
table:

It turns out that $\langle Sign, +^{Sign} \rangle$ is forward complete, i.e., for any $a_1, a_2 \in Sign$, $a_1 + a_2 = a_1 +^{Sign} a_2$. □

It turns out that the possibility of defining a forward complete abstract interpretation on a given abstract
domain $\mu$ does not depend on the choice of the abstract function $f^{\sharp}$ but depends only on the abstract
domain $\mu$. This means that if $\langle \mu, f^{\sharp} \rangle$ is forward complete then the abstract function $f^{\sharp}$ indeed coincides
with the best correct approximation $\mu \circ f$ of the concrete function $f$ on the abstract domain $\mu$. Hence, for
any abstract domain $\mu$ and abstract function $f^{\sharp}$, it turns out that $\langle \mu, f^{\sharp} \rangle$ is forward complete if and only if
$\langle \mu, \mu \circ f \rangle$ is forward complete. This allows us to define the notion of forward completeness independently
of abstract functions as follows: an abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma))$ is forward complete for $f$ (or forward
$f$-complete) iff $f \circ \mu = \mu \circ f \circ \mu$. Let us remark that $\mu$ is forward $f$-complete iff the image $\mathrm{img}(\mu)$ is
closed under applications of the concrete function $f$. If $F$ is a set of concrete functions then $\mu$ is forward
complete for $F$ when $\mu$ is forward complete for all $f \in F$.
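The characterization of forward completeness as closure of the image under the concrete function gives a one-line test on finite domains. The Python sketch below is only an illustration: the name `is_forward_complete`, the two-state transition relation and the sample images are all hypothetical.

```python
def is_forward_complete(image, f):
    """True iff img(mu) is closed under applications of the concrete function f."""
    return all(frozenset(f(x)) in image for x in image)


# Hypothetical two-state system with transitions 1 -> 2 and 2 -> 2.
TOY_TRANS = {(1, 2), (2, 2)}


def toy_pre(ys):
    """Predecessor transformer of the toy system."""
    return {a for (a, b) in TOY_TRANS if b in ys}


EMPTY, ONE, TWO, BOTH = frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})
complete_image = {EMPTY, TWO, BOTH}      # pre({2}) = {1, 2} is in the image
incomplete_image = {ONE, BOTH}           # pre({1}) = {} is missing from the image
```

The second image fails the test because `toy_pre` maps $\{1\}$ to the empty set, which is not a member of the image.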



Forward Complete Shells. It turns out [19, 27] that any abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma))$ can be refined
to its forward $F$-complete shell, namely to the most abstract domain that is forward complete for $F$ and
refines $\mu$. This forward $F$-complete shell of $\mu$ is thus defined as

$\mathcal{S}_F(\mu) \stackrel{\mathrm{def}}{=} \sqcup \{\rho \in \mathrm{uco}(\wp(\Sigma)) \mid \rho \sqsubseteq \mu,\ \rho \text{ is forward } F\text{-complete}\}$.

Forward complete shells admit a constructive fixpoint characterization. Given $\mu \in \mathrm{uco}(\wp(\Sigma))$, consider
the operator $F_{\mu} : \mathrm{uco}(\wp(\Sigma)) \rightarrow \mathrm{uco}(\wp(\Sigma))$ defined by

$F_{\mu}(\rho) \stackrel{\mathrm{def}}{=} \mathrm{Cl}^{\cap}(\mu \cup \{f(X) \mid f \in F,\ X \in \rho\})$.

Thus, $F_{\mu}(\rho)$ refines the abstract domain $\mu$ by adding the images of $\rho$ for all the functions in $F$. It turns out
that $F_{\mu}$ is monotone and therefore admits the greatest fixpoint, denoted by $\mathrm{gfp}(F_{\mu})$, which provides the
forward $F$-complete shell of $\mu$: $\mathcal{S}_F(\mu) = \mathrm{gfp}(F_{\mu})$.

Disjunctive Abstract Domains. An abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma))$ is disjunctive (or additive) when
$\mu$ is additive and this happens exactly when the image $\mathrm{img}(\mu)$ is closed under arbitrary unions. Hence,
a disjunctive abstract domain is completely determined by the image of $\mu$ on singletons because for any
$X \subseteq \Sigma$, $\mu(X) = \cup_{x \in X}\, \mu(\{x\})$. The intuition is that a disjunctive abstract domain does not lose precision
in approximating concrete set unions. We denote by $\mathrm{uco}^{d}(\wp(\Sigma)) \subseteq \mathrm{uco}(\wp(\Sigma))$ the set of disjunctive
abstract domains.

Given any abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma))$, it turns out [13, 20] that $\mu$ can be refined to its
disjunctive completion $\mu^{d}$: this is the most abstract disjunctive domain $\mu^{d} \in \mathrm{uco}^{d}(\wp(\Sigma))$ that refines $\mu$. The
disjunctive completion $\mu^{d}$ can be obtained by closing the image $\mathrm{img}(\mu)$ under arbitrary unions, namely
$\mathrm{img}(\mu^{d}) = \mathrm{Cl}^{\cup}(\mathrm{img}(\mu)) \stackrel{\mathrm{def}}{=} \{\cup S \mid S \subseteq \mathrm{img}(\mu)\}$, where $\cup \varnothing = \varnothing \in \mathrm{Cl}^{\cup}(\mathrm{img}(\mu))$.

It turns out that an abstract domain $\mu$ is disjunctive iff $\mu$ is forward complete for arbitrary concrete set
unions, namely, $\mu$ is disjunctive iff for any $\{X_i\}_{i \in I} \subseteq \wp(\Sigma)$, $\cup_{i \in I}\, \mu(X_i) = \mu(\cup_{i \in I}\, \mu(X_i))$. Thus, when
$\Sigma$ is finite, the disjunctive completion $\mu^{d}$ of $\mu$ coincides with the forward $\cup$-complete shell $\mathcal{S}_{\cup}(\mu)$ of $\mu$.
Also, since the predecessor transformer $\mathrm{pre}_{\rightarrow}$ preserves set unions, it turns out that the forward complete
shell $\mathcal{S}_{\cup,\mathrm{pre}_{\rightarrow}}(\mu)$ for $\{\cup, \mathrm{pre}_{\rightarrow}\}$ can be obtained by iteratively closing the image of $\mu$ under $\mathrm{pre}_{\rightarrow}$ and then
by taking the disjunctive completion, i.e., $\mathcal{S}_{\cup,\mathrm{pre}_{\rightarrow}}(\mu) = \mathcal{S}_{\cup}(\mathcal{S}_{\mathrm{pre}_{\rightarrow}}(\mu))$.

Example 2.3. Let us consider the abstract domain $\mu = \{\varnothing, 3, 4, 12, 34, 1234\}$ in Example 2.1. We have
that $\mu$ is not disjunctive because $12, 3 \in \mu$ while $12 \cup 3 = 123 \notin \mu$. The disjunctive completion $\mu^{d}$ is
obtained by closing $\mu$ under unions: $\mu^{d} = \{\varnothing, 3, 4, 12, 34, 123, 124, 1234\}$. □
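The disjunctive completion of Example 2.3 can be reproduced with a short Python sketch. It is a minimal illustration for finite state spaces, where closing under pairwise unions is enough; the names `disjunctive_completion` and `sets` are assumptions made for this example.

```python
from itertools import combinations


def disjunctive_completion(image):
    """Close the image of an abstract domain under unions (the empty union adds the empty set)."""
    closed = {frozenset()} | {frozenset(x) for x in image}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            if a | b not in closed:
                closed.add(a | b)
                changed = True
    return closed


def sets(*names):
    """Compact notation of the paper: '12' stands for {1, 2}, '' for the empty set."""
    return {frozenset(int(c) for c in n) for n in names}


mu = sets("", "3", "4", "12", "34", "1234")
mu_d = disjunctive_completion(mu)
```

The completion adds exactly the two sets 123 and 124, as stated in the example.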

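Some Properties of Abstract Domains. Let us summarize some easy properties of abstract domains that
will be used in later proofs.

Lemma 2.4. Let $\mu \in \mathrm{uco}(\wp(\Sigma))$, $\rho \in \mathrm{uco}^{d}(\wp(\Sigma))$, $P, Q \in \mathrm{Part}(\Sigma)$ such that $P \preceq \mathrm{par}(\mu)$ and $Q \preceq
\mathrm{par}(\rho)$.

(i) For any $B \in P$, $\mu(B) = \mu(\mathrm{parent}_{\mathrm{par}(\mu)}(B))$.

(ii) For any $X \in \wp(\Sigma)$, $\mu(X) = \cup \{B \in P \mid B \subseteq \mu(X)\}$.

(iii) For any $X \in \wp(\Sigma)$, $\rho(X) = \cup \{\rho(B) \mid B \in Q,\ B \cap X \neq \varnothing\}$.

(iv) $\mathrm{par}(\mu) = \mathrm{par}(\mu^{d})$.

Proof. (i) In general, by definition of $\mathrm{par}(\mu)$, for any $C \in \mathrm{par}(\mu)$ and $S \subseteq C$, $\mu(S) = \mu(C)$. Hence,
since $B \subseteq \mathrm{parent}_{\mathrm{par}(\mu)}(B)$ we have that $\mu(B) = \mu(\mathrm{parent}_{\mathrm{par}(\mu)}(B))$.

(ii) Clearly, $\mu(X) \supseteq \cup \{B \in P \mid B \subseteq \mu(X)\}$. On the other hand, given $z \in \mu(X)$, let $B_z \in P$ be the
block in $P$ that contains $z$. Then, $B_z \subseteq \mu(B_z) = \mu(\{z\}) \subseteq \mu(X)$, so that $z \in \cup \{B \in P \mid B \subseteq \mu(X)\}$.

(iii)

$\rho(X)$ = [as $\rho$ is additive]

$\cup \{\rho(\{x\}) \mid x \in X\}$ = [as $Q \preceq \mathrm{par}(\rho)$]

$\cup \{\rho(B_x) \mid x \in X,\ x \in B_x \in Q\}$ =

$\cup \{\rho(B) \mid B \in Q,\ B \cap X \neq \varnothing\}$.

(iv) Since $\mu^{d} \sqsubseteq \mu$, we have that $\mathrm{par}(\mu^{d}) \preceq \mathrm{par}(\mu)$. On the other hand, if $B \in \mathrm{par}(\mu)$ then for all $x \in B$,
$\mu^{d}(\{x\}) = \mu(\{x\}) = \mu(B)$, so that $B \in \mathrm{par}(\mu^{d})$. □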
3 Simulation Preorder as a Forward Complete Shell

Ranzato and Tapparo [27] showed how strong preservation of specification languages in standard abstract
models like abstract Kripke structures can be generalized by abstract interpretation and cast as a forward
completeness property of generic abstract domains that play the role of abstract models. We rely here on
this framework in order to show that the simulation preorder can be characterized as a forward complete
shell for set union and the predecessor transformer. Let $\mathcal{K} = (\Sigma, \rightarrow, \ell)$ be a Kripke structure. Recall that
the labeling function $\ell$ induces the state partition $P_{\ell} = \{[s]_{\ell} \mid s \in \Sigma\}$. This partition can be made an
abstract domain $\mu_{\ell} \in \mathrm{uco}(\wp(\Sigma))$ by considering the Moore-closure of $P_{\ell}$ that simply adds to $P_{\ell}$ the empty
set and the whole state space, namely $\mu_{\ell} \stackrel{\mathrm{def}}{=} \mathrm{Cl}^{\cap}(\{[s]_{\ell} \mid s \in \Sigma\})$.

Theorem 3.1. Let $\mu_{\mathcal{K}} = \mathcal{S}_{\cup,\mathrm{pre}}(\mu_{\ell})$ be the forward $\{\cup, \mathrm{pre}\}$-complete shell of $\mu_{\ell}$. Then, $R_{\mathrm{sim}} = \{(s, s') \in
\Sigma \times \Sigma \mid s' \in \mu_{\mathcal{K}}(\{s\})\}$ and $P_{\mathrm{sim}} = \mathrm{par}(\mu_{\mathcal{K}})$.

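Proof. Given a disjunctive abstract domain $\mu \in \mathrm{uco}^{d}(\wp(\Sigma))$, define $R_{\mu} \stackrel{\mathrm{def}}{=} \{(s, s') \in \Sigma \times \Sigma \mid s' \in \mu(\{s\})\}$.
We prove the following three preliminary facts:

(1) $\mu$ is forward complete for pre iff $R_{\mu}$ satisfies the following property: for any $s, t, s' \in \Sigma$ such
that $s \rightarrow t$ and $(s, s') \in R_{\mu}$ there exists $t' \in \Sigma$ such that $s' \rightarrow t'$ and $(t, t') \in R_{\mu}$. Observe
that the disjunctive closure $\mu$ is forward complete for pre iff for any $s, t \in \Sigma$, if $s \in \mathrm{pre}(\mu(\{t\}))$
then $\mu(\{s\}) \subseteq \mathrm{pre}(\mu(\{t\}))$, and this happens iff for any $s, t \in \Sigma$, if $s \in \mathrm{pre}(\{t\})$ then $\mu(\{s\}) \subseteq
\mathrm{pre}(\mu(\{t\}))$. This latter statement is equivalent to the fact that for any $s, s', t \in \Sigma$ such that $s \rightarrow t$
and $s' \in \mu(\{s\})$, there exists $t' \in \mu(\{t\})$ such that $s' \rightarrow t'$, namely, for any $s, s', t \in \Sigma$ such that
$s \rightarrow t$ and $(s, s') \in R_{\mu}$ there exists $t' \in \Sigma$ such that $(t, t') \in R_{\mu}$ and $s' \rightarrow t'$.

(2) $\mu \sqsubseteq \mu_{\ell}$ iff $R_{\mu}$ satisfies the property that for any $s, s' \in \Sigma$, if $(s, s') \in R_{\mu}$ then $\ell(s) = \ell(s')$. In fact,
$\mu \sqsubseteq \mu_{\ell} \Leftrightarrow \forall s \in \Sigma.\ \mu(\{s\}) \subseteq \mu_{\ell}(\{s\}) = [s]_{\ell} \Leftrightarrow \forall s, s' \in \Sigma.\ (s' \in \mu(\{s\})$ implies $s' \in [s]_{\ell})$
$\Leftrightarrow \forall s, s' \in \Sigma.\ ((s, s') \in R_{\mu}$ implies $\ell(s) = \ell(s'))$.

(3) Clearly, given $\mu' \in \mathrm{uco}^{d}(\wp(\Sigma))$, $\mu \sqsubseteq \mu'$ iff $R_{\mu} \subseteq R_{\mu'}$.

Let us show that $R_{\mu_{\mathcal{K}}} = R_{\mathrm{sim}}$. By definition, $\mu_{\mathcal{K}}$ is the most abstract disjunctive closure that is forward
complete for pre and refines $\mu_{\ell}$. Thus, by the above points (1) and (2), it turns out that $R_{\mu_{\mathcal{K}}}$ is a simulation
on $\mathcal{K}$. Consider now any simulation $S$ on $\mathcal{K}$ and the function $\mu' = \mathrm{post}_{S^*} : \wp(\Sigma) \rightarrow \wp(\Sigma)$. Let us notice
that $\mu' \in \mathrm{uco}^{d}(\wp(\Sigma))$ and $S \subseteq S^* = R_{\mu'}$. Also, the relation $S^*$ is a simulation because $S$ is a simulation.
Since $S^*$ is a simulation, we have that $R_{\mu'}$ satisfies the conditions of the above points (1) and (2) so that $\mu'$
is forward complete for pre and $\mu' \sqsubseteq \mu_{\ell}$. Moreover, $\mu'$ is disjunctive so that $\mu'$ is also forward complete
for $\cup$. Thus, $\mu' \sqsubseteq \mathcal{S}_{\cup,\mathrm{pre}}(\mu_{\ell}) = \mu_{\mathcal{K}}$. Hence, by point (3) above, $R_{\mu'} \subseteq R_{\mu_{\mathcal{K}}}$ so that $S \subseteq S^* \subseteq R_{\mu_{\mathcal{K}}}$. We have
therefore shown that $R_{\mu_{\mathcal{K}}}$ is the largest simulation on $\mathcal{K}$.

The fact that $P_{\mathrm{sim}} = \mathrm{par}(\mu_{\mathcal{K}})$ comes as a direct consequence because for any $s, t \in \Sigma$, $s \sim_{\mathrm{sim}} t$ iff
$(s, t) \in R_{\mathrm{sim}}$ and $(t, s) \in R_{\mathrm{sim}}$. From $R_{\mu_{\mathcal{K}}} = R_{\mathrm{sim}}$ we obtain that $s \sim_{\mathrm{sim}} t$ iff $s \in \mu_{\mathcal{K}}(\{t\})$ and
$t \in \mu_{\mathcal{K}}(\{s\})$ iff $\mu_{\mathcal{K}}(\{s\}) = \mu_{\mathcal{K}}(\{t\})$. This holds iff $s$ and $t$ belong to the same block in $\mathrm{par}(\mu_{\mathcal{K}})$. □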

Thus, the simulation preorder is characterized as the forward complete shell of an initial abstract
domain $\mu_{\ell}$ induced by the labeling $\ell$ w.r.t. set union $\cup$ and the predecessor transformer pre, while simulation
equivalence is the partition induced by this forward complete shell. Let us observe that set union and the
predecessor pre provide the semantics of, respectively, logical disjunction and the existential next operator
EX. As shown in [27], simulation equivalence can also be characterized in a precise sense as the most
abstract domain that strongly preserves the language

$\varphi ::= \mathrm{atom} \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 \vee \varphi_2 \mid \mathrm{EX}\varphi$.

Example 3.2. Let us consider the Kripke structure $\mathcal{K}$ depicted below where the atoms $p$ and $q$ determine
the labeling function $\ell$.

It is simple to observe that $P_{\mathrm{sim}} = \{1, 2, 3, 4\}$ because: (i) while $3 \rightarrow 4$ we have that $1, 2 \notin \mathrm{pre}(4)$ so that 1
and 2 are not simulation equivalent to 3; (ii) while $1 \in \mathrm{pre}(12)$ we have that $2 \notin \mathrm{pre}(12)$ so that 1 is not simulation
equivalent to 2.

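The abstract domain induced by the labeling is $\mu_{\ell} = \{\varnothing, 4, 123, 1234\} \in \mathrm{uco}(\wp(\Sigma))$. As observed above,
the forward complete shell $\mathcal{S}_{\cup,\mathrm{pre}}(\mu_{\ell}) = \mathcal{S}_{\cup}(\mathcal{S}_{\mathrm{pre}}(\mu_{\ell}))$ so that this domain can be obtained by iteratively
closing the image of $\mu_{\ell}$ under pre and then by taking the disjunctive completion:

- $\mu_0 = \mu_{\ell}$;

- $\mu_1 = \mathrm{Cl}^{\cap}(\mu_0 \cup \mathrm{pre}(\mu_0)) = \mathrm{Cl}^{\cap}(\mu_0 \cup \{\mathrm{pre}(\varnothing) = \varnothing,\ \mathrm{pre}(4) = 34,\ \mathrm{pre}(123) = 12,\ \mathrm{pre}(1234) =
1234\}) = \{\varnothing, 3, 4, 12, 34, 123, 1234\}$;

- $\mu_2 = \mathrm{Cl}^{\cap}(\mu_1 \cup \mathrm{pre}(\mu_1)) = \mathrm{Cl}^{\cap}(\mu_1 \cup \{\mathrm{pre}(3) = 12,\ \mathrm{pre}(12) = 1,\ \mathrm{pre}(34) = 1234\}) =
\{\varnothing, 1, 3, 4, 12, 34, 123, 1234\}$;

- $\mu_3 = \mathrm{Cl}^{\cap}(\mu_2 \cup \mathrm{pre}(\mu_2)) = \mu_2$ (fixpoint).

$\mathcal{S}_{\cup,\mathrm{pre}}(\mu_{\ell})$ is thus given by the disjunctive completion of $\mu_2$, i.e., $\mathcal{S}_{\cup,\mathrm{pre}}(\mu_{\ell}) = \{\varnothing, 1, 3, 4, 12, 13, 14, 34,
123, 124, 134, 1234\} = \mu_{\mathcal{K}}$. Note that $\mu_{\mathcal{K}}(1) = 1$, $\mu_{\mathcal{K}}(2) = 12$, $\mu_{\mathcal{K}}(3) = 3$ and $\mu_{\mathcal{K}}(4) = 4$. Hence,
by Theorem 3.1, the simulation preorder is $R_{\mathrm{sim}} = \{(1, 1), (2, 2), (2, 1), (3, 3), (4, 4)\}$, while $P_{\mathrm{sim}} =
\mathrm{par}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu_{\ell})) = \{1, 2, 3, 4\}$. □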
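The iteration of Example 3.2 can be replayed with a short Python sketch. Since the picture of the Kripke structure is not reproduced in this text, the transition relation `TRANS` below is an assumption: it is one relation consistent with all the pre values listed in the example (in particular, the choice of the transition from 1 to 2 rather than a self-loop on 1 is a guess). The function names are likewise chosen for this illustration and do not belong to the SA implementation.

```python
from itertools import combinations

# ASSUMPTION: one transition relation consistent with the pre values of Example 3.2.
TRANS = {(1, 2), (1, 3), (2, 3), (3, 4), (4, 4)}
SIGMA = {1, 2, 3, 4}


def pre(ys):
    return frozenset(a for (a, b) in TRANS if b in ys)


def close_under(op, family):
    """Close a finite family of sets under a binary operation (intersection or union)."""
    closed = set(family)
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            if op(a, b) not in closed:
                closed.add(op(a, b))
                changed = True
    return closed


def shell(mu_l):
    """Iterate mu := Cl_cap(mu U pre(mu)) to a fixpoint, then close under unions."""
    mu = set(mu_l)
    while True:
        nxt = close_under(frozenset.intersection, mu | {pre(x) for x in mu})
        if nxt == mu:
            break
        mu = nxt
    return close_under(frozenset.union, mu)


def sets(*names):
    return {frozenset(int(c) for c in n) for n in names}


mu_k = shell(sets("", "4", "123", "1234"))
# R_sim = {(s, t) | t in mu_K({s})}, where mu_K({s}) is the least member containing s.
r_sim = {(s, t) for s in SIGMA
         for t in frozenset.intersection(*[x for x in mu_k if s in x])}
```

Under the stated assumption on `TRANS`, the sketch yields the twelve-element domain and the preorder reported in the example.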

Theorem 3.1 is one key result for proving the correctness of our simulation algorithm SA while it is not
needed for understanding how SA works and how to implement it efficiently.



4 Partition-Relation Pairs 

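Let $P \in \mathrm{Part}(\Sigma)$ and $R \subseteq P \times P$ be any relation on the partition $P$. One such pair $\langle P, R \rangle$ is called a
partition-relation pair. A partition-relation pair $\langle P, R \rangle$ induces a disjunctive closure $\mu_{\langle P,R \rangle} \in \mathrm{uco}^{d}(\wp(\Sigma)_{\subseteq})$
as follows: for any $X \in \wp(\Sigma)$,

$\mu_{\langle P,R \rangle}(X) \stackrel{\mathrm{def}}{=} \cup \{C \in P \mid \exists B \in P.\ B \cap X \neq \varnothing,\ (B, C) \in R^*\}$.

It is easily shown that $\mu_{\langle P,R \rangle}$ is indeed a disjunctive uco. Note that, for any $B \in P$ and $x \in B$,

$\mu_{\langle P,R \rangle}(\{x\}) = \mu_{\langle P,R \rangle}(B) = \cup R^*(B) = \cup \{C \in P \mid (B, C) \in R^*\}$.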

This correspondence is a key logical point for proving the correctness of our simulation algorithm. 
In fact, our algorithm maintains a partition-relation pair, where the relation is a preorder, and our proof 
of correctness depends on the fact that this partition-relation pair logically represents a corresponding 
disjunctive abstract domain. 

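Example 4.1. Let $\Sigma = \{1, 2, 3, 4\}$, $P = \{12, 3, 4\} \in \mathrm{Part}(\Sigma)$ and $R = \{(12, 3), (3, 4), (4, 3)\}$. Note
that $R^* = \{(12, 12), (12, 3), (12, 4), (3, 3), (3, 4), (4, 3), (4, 4)\}$. The disjunctive abstract domain $\mu_{\langle P,R \rangle}$
is such that $\mu_{\langle P,R \rangle}(\{1\}) = \mu_{\langle P,R \rangle}(\{2\}) = \{1, 2, 3, 4\}$ and $\mu_{\langle P,R \rangle}(\{3\}) = \mu_{\langle P,R \rangle}(\{4\}) = \{3, 4\}$, so that
the image of $\mu_{\langle P,R \rangle}$ is $\{\varnothing, 34, 1234\}$. □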
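Example 4.1 can be reproduced with the following Python sketch of the closure induced by a partition-relation pair. It is a minimal illustration: blocks are encoded as `frozenset` values, and the function names `reflexive_transitive_closure` and `mu_pr` are assumptions made for this example.

```python
def reflexive_transitive_closure(blocks, rel):
    """R*: add all pairs (B, B) and saturate under composition."""
    star = set(rel) | {(b, b) for b in blocks}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(star):
            for (c, d) in list(star):
                if b == c and (a, d) not in star:
                    star.add((a, d))
                    changed = True
    return star


def mu_pr(blocks, rel, x):
    """Union of all blocks C such that (B, C) is in R* for some block B meeting X."""
    star = reflexive_transitive_closure(blocks, rel)
    out = set()
    for (b, c) in star:
        if b & set(x):
            out |= c
    return frozenset(out)


B12, B3, B4 = frozenset({1, 2}), frozenset({3}), frozenset({4})
P = {B12, B3, B4}
R = {(B12, B3), (B3, B4), (B4, B3)}
```

For instance, `mu_pr(P, R, {1})` returns the whole state space and `mu_pr(P, R, {3})` returns `{3, 4}`, in agreement with the example. Note that $R$ is not a partial order here, so the blocks 3 and 4 are mapped to the same set.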

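On the other hand, any abstract domain $\mu \in \mathrm{uco}(\wp(\Sigma))$ induces a partition-relation pair $\langle P_{\mu}, R_{\mu} \rangle$ as
follows:

- $P_{\mu} \stackrel{\mathrm{def}}{=} \mathrm{par}(\mu)$;

- $R_{\mu} \stackrel{\mathrm{def}}{=} \{(B, C) \in P_{\mu} \times P_{\mu} \mid C \subseteq \mu(B)\}$.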
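The two definitions above can be sketched as follows, using the domain $\mu_{\mathcal{K}}$ of Example 3.2 as input. This is an illustrative Python sketch; the names `close`, `induced_pair` and `sets` are assumptions made for this example.

```python
def close(image, y):
    """mu(y): the least member of the intersection-closed image that contains y."""
    y = frozenset(y)
    return frozenset.intersection(*[x for x in image if y <= x])


def induced_pair(universe, image):
    """Return (P_mu, R_mu): P_mu = par(mu), R_mu = {(B, C) | C is a subset of mu(B)}."""
    blocks = {}
    for x in universe:
        blocks.setdefault(close(image, {x}), set()).add(x)
    p = {frozenset(b) for b in blocks.values()}
    r = {(b, c) for b in p for c in p if c <= close(image, b)}
    return p, r


def sets(*names):
    return {frozenset(int(c) for c in n) for n in names}


# The image of mu_K as listed in Example 3.2.
MU_K = sets("", "1", "3", "4", "12", "13", "14", "34", "123", "124", "134", "1234")
p_mu, r_mu = induced_pair({1, 2, 3, 4}, MU_K)
```

The resulting partition consists of four singleton blocks, and the relation contains, besides the reflexive pairs, only the pair relating block 2 to block 1, which mirrors the simulation preorder of Example 3.2.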

The following properties of partition-relation pairs will be useful in later proofs. 
Lemma 4.2. Let (P, P) be a partition- relation pair and /i G uco(p(S)). 
(i) P < par(>( FiH )). 
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(ii) (P^R^) = (P^R^). 

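Proof. (i) We already observed above that if $B \in P$ and $x \in B$ then $\mu_{\langle P,R \rangle}(\{x\}) = \mu_{\langle P,R \rangle}(B)$, so that
$B \subseteq \{ y \in \Sigma \mid \mu_{\langle P,R \rangle}(\{x\}) = \mu_{\langle P,R \rangle}(\{y\}) \}$, which is a block in $\mathrm{par}(\mu_{\langle P,R \rangle})$.
(ii) By Lemma 2.4 (iv), $P_\mu = \mathrm{par}(\mu) = \mathrm{par}(\mu^{\mathrm{d}}) = P_{\mu^{\mathrm{d}}}$. Moreover,

$R_\mu$ = [by definition]

$\{ (B, C) \in P_\mu \times P_\mu \mid C \subseteq \mu(B) \}$ = [as $P_\mu = P_{\mu^{\mathrm{d}}}$]

$\{ (B, C) \in P_{\mu^{\mathrm{d}}} \times P_{\mu^{\mathrm{d}}} \mid C \subseteq \mu(B) \}$ = [as $\mu(B) = \mu^{\mathrm{d}}(B)$]

$\{ (B, C) \in P_{\mu^{\mathrm{d}}} \times P_{\mu^{\mathrm{d}}} \mid C \subseteq \mu^{\mathrm{d}}(B) \}$ = [by definition]

$R_{\mu^{\mathrm{d}}}$. □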

It turns out that the above two correspondences between partition-relation pairs and disjunctive abstract 
domains are inverse of each other when the relation is a partial order. 

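Lemma 4.3. For any partition $P \in \mathrm{Part}(\Sigma)$, partial order $R \subseteq P \times P$ and disjunctive abstract domain
$\mu \in \mathrm{uco}^{\mathrm{d}}(\wp(\Sigma))$, we have that $\langle P_{\mu_{\langle P,R \rangle}}, R_{\mu_{\langle P,R \rangle}} \rangle = \langle P, R \rangle$ and $\mu_{\langle P_\mu, R_\mu \rangle} = \mu$.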

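Proof. Let us show that $\langle P_{\mu_{\langle P,R \rangle}}, R_{\mu_{\langle P,R \rangle}} \rangle = \langle P, R \rangle$. We first prove that $P_{\mu_{\langle P,R \rangle}} = P$, i.e. $\mathrm{par}(\mu_{\langle P,R \rangle}) = P$. On the one hand, by Lemma 4.2 (i), $P \preceq \mathrm{par}(\mu_{\langle P,R \rangle})$. On the other hand, if $x, y \in \Sigma$, $\mu_{\langle P,R \rangle}(\{x\}) = \mu_{\langle P,R \rangle}(\{y\})$ and $x \in B_x \in P$ and $y \in B_y \in P$ then $(B_x, B_y) \in R^*$ and $(B_y, B_x) \in R^*$. Since $R$ is a partial order, we have that $R^* = R$ is a partial order as well, so that $B_x = B_y$, namely $\mathrm{par}(\mu_{\langle P,R \rangle}) \preceq P$.
Let us prove now that $R_{\mu_{\langle P,R \rangle}} = R$. In fact, for any $(B, C) \in \mathrm{par}(\mu_{\langle P,R \rangle}) \times \mathrm{par}(\mu_{\langle P,R \rangle})$,

$(B, C) \in R_{\mu_{\langle P,R \rangle}}$ $\Leftrightarrow$ [by definition of $R_{\mu_{\langle P,R \rangle}}$]
$C \subseteq \mu_{\langle P,R \rangle}(B)$ $\Leftrightarrow$ [by definition of $\mu_{\langle P,R \rangle}$]
$(B, C) \in R^*$ $\Leftrightarrow$ [since $R^* = R$]
$(B, C) \in R$.

Finally, let us show that $\mu_{\langle P_\mu, R_\mu \rangle} = \mu$. Since both $\mu_{\langle P_\mu, R_\mu \rangle}$ and $\mu$ are disjunctive it is enough to prove that
for all $x \in \Sigma$, $\mu_{\langle P_\mu, R_\mu \rangle}(\{x\}) = \mu(\{x\})$. Given $x \in \Sigma$ consider the block $B_x \in P_\mu = \mathrm{par}(\mu)$ containing $x$. Then,

$\mu_{\langle P_\mu, R_\mu \rangle}(\{x\})$ = [by definition of $\mu_{\langle P_\mu, R_\mu \rangle}$]
$\bigcup \{ C \in P_\mu \mid (B_x, C) \in R_\mu^* \}$ = [since $R_\mu^* = R_\mu$]
$\bigcup \{ C \in P_\mu \mid (B_x, C) \in R_\mu \}$ = [by definition of $R_\mu$]
$\bigcup \{ C \in P_\mu \mid C \subseteq \mu(B_x) \}$ = [by Lemma 2.4 (ii)]
$\mu(B_x)$ = [since $\mu(B_x) = \mu(\{x\})$]
$\mu(\{x\})$. □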

Our simulation algorithm relies on the following condition on a partition-relation pair $\langle P, R \rangle$ w.r.t. a
transition system $(\Sigma, \rightarrow)$ which guarantees that the corresponding disjunctive abstract domain $\mu_{\langle P,R \rangle}$ is
forward complete for the predecessor operator pre.

Lemma 4.4. Let $(\Sigma, \rightarrow)$ be a transition system and $\langle P, R \rangle$ be a partition-relation pair where $R$ is reflexive.
Assume that for any $B, C \in P$, if $C \cap \mathrm{pre}(B) \neq \varnothing$ then $\bigcup R(C) \subseteq \mathrm{pre}(\bigcup R(B))$. Then, $\mu_{\langle P,R \rangle}$ is forward
complete for pre.

Proof. We preliminarily show the following fact: 

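(‡) Let $\mu \in \mathrm{uco}^{\mathrm{d}}(\wp(\Sigma))$ and $P \in \mathrm{Part}(\Sigma)$ such that $P \preceq \mathrm{par}(\mu)$. Then, $\mu$ is forward complete for pre
iff for any $B, C \in P$, if $C \cap \mathrm{pre}(B) \neq \varnothing$ then $\mu(C) \subseteq \mathrm{pre}(\mu(B))$.

($\Rightarrow$) Let $B, C \in P$ such that $C \cap \mathrm{pre}(B) \neq \varnothing$. Since $B \subseteq \mu(B)$ we also have that $C \cap \mathrm{pre}(\mu(B)) \neq \varnothing$. By forward completeness, $\mathrm{pre}(\mu(B)) = \mu(\mathrm{pre}(\mu(B)))$. Since $P \preceq \mathrm{par}(\mu)$, $C \in P$ and $C \cap \mu(\mathrm{pre}(\mu(B))) = C \cap \mathrm{pre}(\mu(B)) \neq \varnothing$ we have that $C \subseteq \mu(\mathrm{pre}(\mu(B))) = \mathrm{pre}(\mu(B))$, so that, by applying the monotone map $\mu$, $\mu(C) \subseteq \mu(\mathrm{pre}(\mu(B))) = \mathrm{pre}(\mu(B))$.

($\Leftarrow$) Firstly, we show the following property (*): for any $B, C \in P$, if $C \cap \mathrm{pre}(\mu(B)) \neq \varnothing$ then $\mu(C) \subseteq \mathrm{pre}(\mu(B))$. Since $P \preceq \mathrm{par}(\mu)$, by Lemma 2.4 (ii), $C \cap \mathrm{pre}(\mu(B)) = C \cap \mathrm{pre}(\bigcup \{ D \in P \mid D \subseteq \mu(B) \})$, so that if $C \cap \mathrm{pre}(\mu(B)) \neq \varnothing$ then $C \cap \mathrm{pre}(D) \neq \varnothing$ for some $D \in P$ such that $D \subseteq \mu(B)$. Hence, by hypothesis, $\mu(C) \subseteq \mathrm{pre}(\mu(D))$. Since $\mu(D) \subseteq \mu(B)$, we thus obtain that $\mu(C) \subseteq \mathrm{pre}(\mu(D)) \subseteq \mathrm{pre}(\mu(B))$. Let us now prove that $\mu$ is forward complete for pre. We first show the following property (**): for any $B \in P$, $\mu(\mathrm{pre}(\mu(B))) \subseteq \mathrm{pre}(\mu(B))$. In fact, since $P \preceq \mathrm{par}(\mu)$, we have that:

$\mu(\mathrm{pre}(\mu(B)))$ = [by Lemma 2.4 (iii) because $\mu$ is additive]
$\bigcup \{ \mu(C) \mid C \in P,\ C \cap \mathrm{pre}(\mu(B)) \neq \varnothing \}$ $\subseteq$ [by the above property (*)]
$\mathrm{pre}(\mu(B))$.

Hence, for any $X \in \wp(\Sigma)$, we have that:

$\mu(\mathrm{pre}(\mu(X)))$ = [since, by Lemma 2.4 (iii), $\mu(X) = \bigcup_i \mu(B_i)$ for some $\{B_i\} \subseteq P$]
$\mu(\mathrm{pre}(\bigcup_i \mu(B_i)))$ = [since $\mu$ and pre are additive]
$\bigcup_i \mu(\mathrm{pre}(\mu(B_i)))$ $\subseteq$ [by the above property (**)]
$\bigcup_i \mathrm{pre}(\mu(B_i))$ = [since pre is additive]
$\mathrm{pre}(\bigcup_i \mu(B_i))$ = [since $\mu(X) = \bigcup_i \mu(B_i)$]
$\mathrm{pre}(\mu(X))$.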

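Let us now turn to show the lemma. By Lemma 4.2 (i), we have that $P \preceq \mathrm{par}(\mu_{\langle P,R \rangle})$. By the above
fact (‡), in order to prove that $\mu_{\langle P,R \rangle}$ is forward complete for pre it is sufficient to show that for any
$B, C \in P$, if $C \cap \mathrm{pre}(B) \neq \varnothing$ then $\mu_{\langle P,R \rangle}(C) \subseteq \mathrm{pre}(\mu_{\langle P,R \rangle}(B))$. Thus, assume that $C \cap \mathrm{pre}(B) \neq \varnothing$.
We need to show that $\bigcup R^*(C) \subseteq \mathrm{pre}(\bigcup R^*(B))$. Assume that $(C, D) \in R^*$, namely that there exist
$\{B_i\}_{i \in [0,k]} \subseteq P$, for some $k \geq 0$, such that $B_0 = C$, $B_k = D$ and for any $i \in [0, k)$, $(B_i, B_{i+1}) \in R$. We
show by induction on $k$ that $D \subseteq \mathrm{pre}(\bigcup R^*(B))$.

($k = 0$) This means that $C = D$. Since $R$ is assumed to be reflexive, we have that $(C, C) \in R$. By hypothesis, $\bigcup R(C) \subseteq \mathrm{pre}(\bigcup R(B))$ so that we obtain $D = C \subseteq \bigcup R(C) \subseteq \mathrm{pre}(\bigcup R(B)) \subseteq \mathrm{pre}(\bigcup R^*(B))$.

($k + 1$) Assume that $(C, B_1), (B_1, B_2), \ldots, (B_k, D) \in R$. By inductive hypothesis, $B_k \subseteq \mathrm{pre}(\bigcup R^*(B))$.
Note that, by additivity of pre, $\mathrm{pre}(\bigcup R^*(B)) = \bigcup \{ \mathrm{pre}(E) \mid E \in P,\ (B, E) \in R^* \}$. Thus, there
exists some $E \in P$ such that $(B, E) \in R^*$ and $B_k \cap \mathrm{pre}(E) \neq \varnothing$. Hence, by hypothesis, $\bigcup R(B_k) \subseteq \mathrm{pre}(\bigcup R(E))$. Observe that $\bigcup R(E) \subseteq \bigcup R^*(E) \subseteq \bigcup R^*(B)$ so that $D \subseteq \bigcup R(B_k) \subseteq \mathrm{pre}(\bigcup R(E)) \subseteq \mathrm{pre}(\bigcup R^*(B))$. □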



5 Henzinger, Henzinger and Kopke's Algorithm 

Our simulation algorithm SA is designed as a symbolic modification of Henzinger, Henzinger and Kopke's
simulation algorithm [23]. This algorithm is designed in three incremental steps encoded by the procedures
SchematicSimilarity, RefinedSimilarity and HHK (called EfficientSimilarity in [23]) in Figure 1.

Consider any (possibly non total) finite Kripke structure $(\Sigma, \rightarrow, \ell)$. The idea of the basic SchematicSimilarity
algorithm is simple. For each state $v \in \Sigma$, the simulator set $\mathit{Sim}(v) \subseteq \Sigma$ contains states that are candidates
for simulating $v$. Hence, $\mathit{Sim}(v)$ is initialized with all the states having the same labeling as $v$, that is $[v]_\ell$.
The algorithm then proceeds iteratively as follows: if $u \rightarrow v$, $w \in \mathit{Sim}(u)$ but there is no $w' \in \mathit{Sim}(v)$ such
that $w \rightarrow w'$ then $w$ cannot simulate $u$ and therefore $\mathit{Sim}(u)$ is refined to $\mathit{Sim}(u) \setminus \{w\}$.

SchematicSimilarity() {
  forall v ∈ Σ do Sim(v) := [v]_ℓ;
  while ∃u, v, w ∈ Σ such that (u→v & w ∈ Sim(u) & post({w}) ∩ Sim(v) = ∅) do
    Sim(u) := Sim(u) \ {w};
}

RefinedSimilarity() {
  forall v ∈ Σ do
    prevSim(v) := Σ;
    if post({v}) = ∅ then Sim(v) := [v]_ℓ; else Sim(v) := [v]_ℓ ∩ pre(Σ);
  while ∃v ∈ Σ such that Sim(v) ≠ prevSim(v) do
    // Inv1: ∀v ∈ Σ. Sim(v) ⊆ prevSim(v)
    // Inv2: ∀u, v ∈ Σ. u→v ⇒ Sim(u) ⊆ pre(prevSim(v))
    Remove := pre(prevSim(v)) \ pre(Sim(v));
    prevSim(v) := Sim(v);
    forall u ∈ pre(v) do Sim(u) := Sim(u) \ Remove;
}

HHK() {
  // forall v ∈ Σ do prevSim(v) := Σ;
  forall v ∈ Σ do
    if post({v}) = ∅ then Sim(v) := [v]_ℓ; else Sim(v) := [v]_ℓ ∩ pre(Σ);
    Remove(v) := pre(Σ) \ pre(Sim(v));
  while ∃v ∈ Σ such that Remove(v) ≠ ∅ do
    // Inv3: ∀v ∈ Σ. Remove(v) = pre(prevSim(v)) \ pre(Sim(v))
    // prevSim(v) := Sim(v);
    Remove := Remove(v);
    Remove(v) := ∅;
    forall u ∈ pre(v) do
      forall w ∈ Remove do
        if w ∈ Sim(u) then
          Sim(u) := Sim(u) \ {w};
          forall w'' ∈ pre(w) such that w'' ∉ pre(Sim(u)) do
            Remove(u) := Remove(u) ∪ {w''};
}

Figure 1: HHK Algorithm.

This basic procedure is then refined to the algorithm RefinedSimilarity. The key point here is to
store for each state $v \in \Sigma$ an additional set of states $\mathit{prevSim}(v)$ that is a superset of $\mathit{Sim}(v)$ (invariant
Inv1) and contains the states that were in $\mathit{Sim}(v)$ in some past iteration where $v$ was selected. If $u \rightarrow v$
then the invariant Inv2 allows us to refine $\mathit{Sim}(u)$ by scrutinizing only the states in $\mathrm{pre}(\mathit{prevSim}(v))$ instead
of all the possible states in $\Sigma$. In fact, while in SchematicSimilarity, $\mathit{Sim}(u)$ is reduced to $\mathit{Sim}(u) \setminus
(\Sigma \setminus \mathrm{pre}(\mathit{Sim}(v)))$, in RefinedSimilarity, $\mathit{Sim}(u)$ is reduced in the same way by removing from it the
states in $\mathit{Remove} = \mathrm{pre}(\mathit{prevSim}(v)) \setminus \mathrm{pre}(\mathit{Sim}(v))$. The initialization of $\mathit{Sim}(v)$ that distinguishes
the case $\mathrm{post}(\{v\}) = \varnothing$ allows us to initially establish the invariant Inv2. Let us remark that the original
RefinedSimilarity algorithm presented in [23] contains the following bug: the statement $\mathit{prevSim}(v) :=
\mathit{Sim}(v)$ is placed just after the inner for-loop instead of immediately preceding the inner for-loop. It turns
out that this is not correct as shown by the following example.

Example 5.1. Let us consider the Kripke structure in Example 3.2. We already observed that the simulation
relation is $R_{\mathrm{sim}} = \{(1, 1), (2, 2), (2, 1), (3, 3), (4, 4)\}$. However, one can check that the original version of
the RefinedSimilarity algorithm in [23], where the assignment $\mathit{prevSim}(v) := \mathit{Sim}(v)$ follows the inner
for-loop, provides as output $\mathit{Sim}(1) = \{1, 2\}$, $\mathit{Sim}(2) = \{1, 2\}$, $\mathit{Sim}(3) = \{3\}$, $\mathit{Sim}(4) = \{4\}$, namely
the state 2 appears to simulate the state 1 while this is not the case. The problem with the original version
in [23] of the RefinedSimilarity algorithm lies in the fact that when $v \in \mathrm{pre}(\{v\})$, as happens in this example
for state 1, the set $\mathit{Sim}(v)$ may be refined to $\mathit{Sim}(v) \setminus \mathit{Remove}$ during the inner for-loop,
so that if the assignment $\mathit{prevSim}(v) := \mathit{Sim}(v)$ follows the inner for-loop then $\mathit{prevSim}(v)$ might be
computed as an incorrect subset of the right set. □
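The two procedures discussed so far can be sketched in Python as follows. This is an illustrative set-based rendering, not the implementation of [23]: the function names are hypothetical, and the three-state structure at the end is invented for the illustration (it is not the structure of Example 3.2). In `refined_similarity` the assignment to `prev_sim[v]` precedes the inner loop, as in the corrected version above.

```python
def schematic_similarity(states, trans, label):
    """SchematicSimilarity: naive refinement of the simulator sets Sim(v)."""
    post = {s: {b for (a, b) in trans if a == s} for s in states}
    sim = {v: {w for w in states if label[w] == label[v]} for v in states}
    changed = True
    while changed:
        changed = False
        for (u, v) in trans:
            for w in list(sim[u]):
                if not (post[w] & sim[v]):   # w cannot match the move u -> v
                    sim[u].discard(w)
                    changed = True
    return sim


def refined_similarity(states, trans, label):
    """RefinedSimilarity with prevSim(v) saved before the inner for-loop."""
    everything = set(states)

    def pre(xs):
        return {a for (a, b) in trans if b in xs}

    has_successor = pre(everything)
    sim, prev_sim = {}, {}
    for v in states:
        prev_sim[v] = set(everything)
        same_label = {w for w in states if label[w] == label[v]}
        sim[v] = same_label & has_successor if v in has_successor else same_label
    while True:
        v = next((x for x in states if sim[x] != prev_sim[x]), None)
        if v is None:
            return sim
        remove = pre(prev_sim[v]) - pre(sim[v])
        prev_sim[v] = set(sim[v])            # corrected position of the assignment
        for u in pre({v}):
            sim[u] -= remove


# Hypothetical structure (not from the paper): state 1 has a self-loop.
states = [1, 2, 3]
trans = [(1, 1), (1, 3), (2, 2)]
label = {1: "p", 2: "p", 3: "q"}
basic = schematic_similarity(states, trans, label)
refined = refined_similarity(states, trans, label)
```

On this structure both functions return the same simulator sets: state 1 is simulated only by itself, while state 2 is simulated by both 1 and 2.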

RefinedSimilarity is further refined to the final HHK algorithm. The idea here is that instead of
recomputing at each iteration of the while-loop the set $\mathit{Remove} := \mathrm{pre}(\mathit{prevSim}(v)) \setminus \mathrm{pre}(\mathit{Sim}(v))$ for
the selected state $v$, a set $\mathit{Remove}(v)$ is maintained and incrementally updated for each state $v \in \Sigma$ in
such a way that it satisfies the invariant Inv3. The original version of HHK in [23] also suffers from a
bug that is a direct consequence of the problem in RefinedSimilarity described above: within the main
while-loop of HHK, the statement $\mathit{Remove}(v) := \varnothing$ is placed just after the outermost for-loop instead of
immediately preceding the outermost for-loop. It is easy to show that this is not correct by resorting again
to Example 5.1.

The implementation of HHK exploits a matrix $\mathit{Count}(u, v)$, indexed on states $u, v \in \Sigma$, such that
$\mathit{Count}(u, v) = |\mathrm{post}(u) \cap \mathit{Sim}(v)|$, i.e., $\mathit{Count}(u, v)$ stores the number of transitions from $u$ to some
state $w \in \mathit{Sim}(v)$. Hence, the test $w'' \notin \mathrm{pre}(\mathit{Sim}(u))$ in the innermost for-loop can be done in $O(1)$ by
checking whether $\mathit{Count}(w'', u)$ is 0 or not. This provides an efficient implementation of HHK that runs
in $O(|\Sigma||{\rightarrow}|)$ time, while the space complexity is in $O(|\Sigma|^2 \log |\Sigma|)$, namely it is more than quadratic in
the size of the state space. Let us remark that the key property for showing the $O(|\Sigma||{\rightarrow}|)$ time bound is as
follows: if a state $v$ is selected at some iterations $i$ and $j$ of the while-loop and the iteration $i$ precedes the
iteration $j$ then $\mathit{Remove}_i(v) \cap \mathit{Remove}_j(v) = \varnothing$, so that the sets in $\{ \mathit{Remove}_i(v) \mid v$ is selected at some
iteration $i \}$ are pairwise disjoint.
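The role of the Count matrix can be illustrated with a short Python sketch of HHK. It is a simplified rendering under stated assumptions: `count` is a dictionary keyed by pairs of states rather than a matrix, the name `hhk` is hypothetical, and the test structure is the same invented one with a self-loop on state 1 (not taken from the paper).

```python
def hhk(states, trans, label):
    """HHK with counters: count[u, v] = |post(u) ∩ Sim(v)|."""
    post = {s: set() for s in states}
    pre = {s: set() for s in states}
    for a, b in trans:
        post[a].add(b)
        pre[b].add(a)
    has_successor = {s for s in states if post[s]}
    sim, remove, count = {}, {}, {}
    for v in states:
        same_label = {w for w in states if label[w] == label[v]}
        sim[v] = same_label & has_successor if post[v] else same_label
        for u in states:
            count[u, v] = len(post[u] & sim[v])
        # Remove(v) = pre(Σ) \ pre(Sim(v)): states with a successor but none in Sim(v)
        remove[v] = {u for u in has_successor if count[u, v] == 0}
    while True:
        v = next((x for x in states if remove[x]), None)
        if v is None:
            return sim
        rem, remove[v] = remove[v], set()    # reset BEFORE the for-loops
        for u in pre[v]:
            for w in rem:
                if w in sim[u]:
                    sim[u].discard(w)
                    for w2 in pre[w]:
                        count[w2, u] -= 1
                        if count[w2, u] == 0:    # O(1) test for w2 ∉ pre(Sim(u))
                            remove[u].add(w2)


# Hypothetical structure (not from the paper).
states = [1, 2, 3]
trans = [(1, 1), (1, 3), (2, 2)]
label = {1: "p", 2: "p", 3: "q"}
sim = hhk(states, trans, label)
```

Each counter is decremented only when a successor actually leaves a simulator set, which is what makes the constant-time membership test possible.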

6 A New Simulation Algorithm 

6.1 The Basic Algorithm 

Let us consider any (possibly non total) finite Kripke structure $(\Sigma, \rightarrow, \ell)$. As recalled above, the HHK
procedure maintains for each state $s \in \Sigma$ a simulator set $\mathit{Sim}(s) \subseteq \Sigma$ and a remove set $\mathit{Remove}(s) \subseteq
\Sigma$. The simulation preorder $R_{\mathrm{sim}}$ is encoded by the output $\{\mathit{Sim}(s)\}_{s \in \Sigma}$ as follows: $(s, s') \in R_{\mathrm{sim}}$
iff $s' \in \mathit{Sim}(s)$. Hence, the simulation equivalence partition $P_{\mathrm{sim}}$ is obtained as follows: $s$ and $s'$ are
simulation equivalent iff $s \in \mathit{Sim}(s')$ and $s' \in \mathit{Sim}(s)$. Our algorithm relies on the idea of modifying the
HHK procedure in order to maintain a partition-relation pair $\langle P, \mathit{Rel} \rangle$ in place of $\{\mathit{Sim}(s)\}_{s \in \Sigma}$, together
with a remove set $\mathit{Remove}(B) \subseteq \Sigma$ for each block $B \in P$. The basic idea is to replace the family of
sets $S = \{\mathit{Sim}(s)\}_{s \in \Sigma}$ with the following state partition $P$ induced by $S$: $s_1 \sim_S s_2$ iff for all $s \in \Sigma$,
$s_1 \in \mathit{Sim}(s) \Leftrightarrow s_2 \in \mathit{Sim}(s)$. Then, a reflexive relation $\mathit{Rel} \subseteq P \times P$ on $P$ gives rise to a
partition-relation pair where the intuition is as follows: given a state $s$ and a block $B \in P$, (i) if $s \in B$ then the
current simulator set for $s$ is the union of blocks in $P$ that are in relation with $B$, i.e. $\mathit{Sim}(s) = \bigcup \mathit{Rel}(B)$;
(ii) if $s, s' \in B$ then $s$ and $s'$ are currently candidates to be simulation equivalent. Thus, a partition-relation
pair $\langle P, \mathit{Rel} \rangle$ represents the current approximation of the simulation preorder and in particular $P$ represents
the current approximation of simulation equivalence.

1 BasicSA(PartitionRelation ⟨P, Rel⟩) {
2   while ∃B, C ∈ P such that (C ∩ pre(B) ≠ ∅ & ∪Rel(C) ⊈ pre(∪Rel(B))) do
3     S := pre(∪Rel(B));
4     P_prev := P; B_prev := B;
5     P := Split(P, S);
6     forall C ∈ P do Rel(C) := {D ∈ P | D ⊆ ∪Rel(parent_{P_prev}(C))};
7     forall C ∈ P such that C ∩ pre(B_prev) ≠ ∅ do Rel(C) := {D ∈ Rel(C) | D ⊆ S};
8 }

Figure 2: Basic Simulation Algorithm.

Partition-relation pairs have been used by Henzinger, Henzinger and Kopke [23] to compute the simulation
preorder on effectively presented infinite transition systems, notably hybrid automata. Henzinger
et al. provide a symbolic procedure, called SymbolicSimilarity in [23], that is derived as a symbolization
through partition-relation pairs of their basic simulation algorithm SchematicSimilarity in Figure 1.
Moreover, partition-relation pairs are also exploited by Gentilini et al. [18] in their simulation algorithm
for representing simulation relations. The distinctive feature of our use of partition-relation pairs is that,
by relying on the results in Section 4, we logically view partition-relation pairs as abstract domains and
therefore we can reason on them by using abstract interpretation.

Following Henzinger et al. [23], our simulation algorithm is designed in three incremental steps. We
exploit the following results for designing the basic algorithm.

- Theorem 3.1 tells us that the simulation preorder can be obtained from the forward $\{\cup, \mathrm{pre}\}$-complete
shell of an initial abstract domain $\mu_\ell$ induced by the labeling $\ell$.

- As shown in Section 4, a partition-relation pair can be viewed as representing a disjunctive abstract
domain.

- Lemma 4.4 gives us a condition on a partition-relation pair which guarantees that the corresponding
abstract domain is forward complete for pre. Moreover, this abstract domain is disjunctive as well,
being induced by a partition-relation pair.

Thus, the idea consists in iteratively and minimally refining an initial partition-relation pair $\langle P, \mathit{Rel} \rangle$
induced by the labeling $\ell$ until the condition of Lemma 4.4 is satisfied: for all $B, C \in P$,

$C \cap \mathrm{pre}(B) \neq \varnothing \ \Rightarrow\ \bigcup \mathit{Rel}(C) \subseteq \mathrm{pre}(\bigcup \mathit{Rel}(B))$.

Let us observe that $C \cap \mathrm{pre}(B) \neq \varnothing$ means that $C \rightarrow^{\exists\exists} B$. The basic algorithm, called BasicSA, is in
Figure 2. The current partition-relation pair $\langle P, \mathit{Rel} \rangle$ is refined by the following three steps in BasicSA. If
$B$ is the block of the current partition $P$ selected by the while-loop then:

(i) the current partition $P$ is split with respect to the set $S = \mathrm{pre}(\bigcup \mathit{Rel}(B))$;

(ii) if $C$ is a newly generated block after splitting the current partition and $\mathrm{parent}_{P_{\mathrm{prev}}}(C)$ is its parent
block in the partition $P_{\mathrm{prev}}$ before the splitting operation then $\mathit{Rel}(C)$ is modified so that
$\bigcup \mathit{Rel}(C) = \bigcup \mathit{Rel}(\mathrm{parent}_{P_{\mathrm{prev}}}(C))$;

(iii) the current relation $\mathit{Rel}$ is refined for the (new and old) blocks $C$ such that $C \rightarrow^{\exists\exists} B$ by removing
from $\mathit{Rel}(C)$ those blocks that are not contained in $S$; observe that after having split $P$ w.r.t. $S$ it
turns out that one such block $D$ either is contained in $S$ or is disjoint with $S$.
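Steps (i)-(iii) can be sketched directly in Python with sets of frozensets. The sketch below makes several assumptions that are not stated in this section: the initial pair induced by the labeling is taken to be the partition of states with equal labels together with the identity relation, the name `basic_sa` is hypothetical, and the test structure is invented for the illustration.

```python
def basic_sa(states, trans, label):
    """Set-based sketch of BasicSA on a partition-relation pair (part, rel)."""
    def pre(xs):
        return {a for (a, b) in trans if b in xs}

    def union(blocks):
        return set().union(*blocks)

    # Assumed initial pair: blocks of equally labelled states, identity relation.
    groups = {}
    for s in states:
        groups.setdefault(label[s], set()).add(s)
    part = {frozenset(g) for g in groups.values()}
    rel = {b: {b} for b in part}
    while True:
        pick = next(((b, c) for b in part for c in part
                     if c & pre(b) and not union(rel[c]) <= pre(union(rel[b]))), None)
        if pick is None:
            return part, rel
        b_prev = pick[0]
        s = pre(union(rel[b_prev]))
        # (i) split every block with respect to s, remembering parent blocks
        parent = {}
        for blk in part:
            for piece in (blk & s, blk - s):
                if piece:
                    parent[frozenset(piece)] = blk
        new_part = set(parent)
        # (ii) each block inherits the relation of its parent
        new_rel = {c: {d for d in new_part if d <= union(rel[parent[c]])}
                   for c in new_part}
        # (iii) prune Rel(C) for the blocks C with a transition into b_prev
        pre_b = pre(b_prev)
        for c in new_part:
            if c & pre_b:
                new_rel[c] = {d for d in new_rel[c] if d <= s}
        part, rel = new_part, new_rel


# Hypothetical structure (not from the paper).
states = [1, 2, 3]
trans = [(1, 1), (1, 3), (2, 2)]
label = {1: "p", 2: "p", 3: "q"}
part, rel = basic_sa(states, trans, label)
```

On this structure the output partition consists of singletons and the block {2} is related to both {1} and {2}, matching the simulator sets one obtains state by state. The sketch recomputes `pre` at every step and is meant only to mirror the three steps, not to be efficient.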






Let us remark that although the symbolic simulation algorithm for infinite graphs SymbolicSimilarity
in [23] may appear similar to our BasicSA algorithm, it is instead inherently different for the following
reason: the role played by the condition $C \rightarrow^{\exists\exists} B \ \&\ \bigcup \mathit{Rel}(C) \not\subseteq \mathrm{pre}(\bigcup \mathit{Rel}(B))$ in the while-loop of
BasicSA is played in SymbolicSimilarity by $C \rightarrow^{\exists\exists} \bigcup \mathit{Rel}(B) \ \&\ \bigcup \mathit{Rel}(C) \not\subseteq \mathrm{pre}(\bigcup \mathit{Rel}(B))$, and this
latter condition is computationally harder to check.

The following correctness result formalizes that BasicSA can be viewed as an abstract domain refinement
algorithm that allows us to compute forward complete shells for $\{\cup, \mathrm{pre}\}$. For any abstract domain
$\mu \in \mathrm{uco}(\wp(\Sigma))$, we write $\mu' = \mathrm{BasicSA}(\mu)$ when the algorithm BasicSA on an input partition-relation
pair $\langle P_\mu, R_\mu \rangle$ terminates and outputs a partition-relation pair $\langle P', R' \rangle$ such that $\mu' = \mu_{\langle P',R' \rangle}$.

Theorem 6.1. Let $\Sigma$ be finite. Then, BasicSA terminates on any input domain $\mu \in \mathrm{uco}(\wp(\Sigma))$ and
$\mathrm{BasicSA}(\mu) = \mathcal{S}_{\cup,\mathrm{pre}}(\mu)$.

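Proof. Let $\langle P_{\mathrm{curr}}, R_{\mathrm{curr}} \rangle$ and $\langle P_{\mathrm{next}}, R_{\mathrm{next}} \rangle$ be, respectively, the current and next partition-relation pair in
some iteration of $\mathrm{BasicSA}(\mu)$. By line 5, $P_{\mathrm{next}} \preceq P_{\mathrm{curr}}$ always holds. Moreover, if $P_{\mathrm{next}} = P_{\mathrm{curr}}$ then it
turns out that $R_{\mathrm{next}} \subsetneq R_{\mathrm{curr}}$: in fact, if $B, C \in P_{\mathrm{curr}}$, $C \cap \mathrm{pre}(B) \neq \varnothing$ and $\bigcup R_{\mathrm{curr}}(C) \not\subseteq \mathrm{pre}(\bigcup R_{\mathrm{curr}}(B))$
then, by lines 6 and 7, $\bigcup R_{\mathrm{next}}(C) \subsetneq \bigcup R_{\mathrm{curr}}(C)$ because there exists $x \in \bigcup R_{\mathrm{curr}}(C)$ such that $x \notin
\mathrm{pre}(\bigcup R_{\mathrm{curr}}(B))$ so that if $B_x \in P_{\mathrm{next}} = P_{\mathrm{curr}}$ is the block that contains $x$ then $B_x \cap (\bigcup R_{\mathrm{next}}(C)) = \varnothing$
while $B_x \subseteq \bigcup R_{\mathrm{curr}}(C)$. Thus, either $P_{\mathrm{next}} \prec P_{\mathrm{curr}}$ or $R_{\mathrm{next}} \subsetneq R_{\mathrm{curr}}$, so that, since the state space $\Sigma$ is
finite, the procedure BasicSA terminates.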

Let $\mu' = \mathrm{BasicSA}(\mu)$, namely, let $\mu' = \mu_{\langle P',R' \rangle}$ where $\langle P', R' \rangle$ is the output of BasicSA on input
$\langle P_\mu, R_\mu \rangle$. Let $\{\langle P_i, R_i \rangle\}_{i \in [0,k]}$ be the sequence of partition-relation pairs computed by BasicSA, where
$\langle P_0, R_0 \rangle = \langle P_\mu, R_\mu \rangle$ and $\langle P_k, R_k \rangle = \langle P', R' \rangle$. Let us first observe that for any $i \in [0, k)$, $P_{i+1} \preceq P_i$
because the current partition is refined by the splitting operation in line 5. Moreover, for any $i \in [0, k)$
and $C \in P_{i+1}$, note that $\bigcup R_{i+1}(C) \subseteq \bigcup R_i(\mathrm{parent}_{P_i}(C))$, because the current relation is modified only at
lines 6 and 7.

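Let us also observe that for any $i \in [0, k]$, $R_i$ is a reflexive relation because $R_0$ is reflexive and the
operations at lines 6-7 preserve the reflexivity of the current relation. Let us show this latter fact. If
$C \in P_{\mathrm{next}}$ is such that $C \cap \mathrm{pre}(B_{\mathrm{prev}}) \neq \varnothing$ then because, by hypothesis, $B_{\mathrm{prev}} \in R_{\mathrm{prev}}(B_{\mathrm{prev}})$, we have
that $C \cap \mathrm{pre}(\bigcup R_{\mathrm{prev}}(B_{\mathrm{prev}})) \neq \varnothing$ so that $C \subseteq S = \mathrm{pre}(\bigcup R_{\mathrm{prev}}(B_{\mathrm{prev}}))$. Hence, if $C \in P_{\mathrm{next}} \cap P_{\mathrm{prev}}$ then
$C \in R_{\mathrm{next}}(C)$, while if $C \in P_{\mathrm{next}} \setminus P_{\mathrm{prev}}$ then, by hypothesis, $\mathrm{parent}_{P_{\mathrm{prev}}}(C) \in R_{\mathrm{prev}}(\mathrm{parent}_{P_{\mathrm{prev}}}(C))$
so that, by line 6, $C \in R_{\mathrm{next}}(C)$ also in this case.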
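For any $B \in P' = P_k$, we have that

$\mu'(B)$ = [by definition of $\mu'$]
$\bigcup R_k^*(B)$ $\subseteq$ [as $\bigcup R_k(B) \subseteq \bigcup R_0(\mathrm{parent}_{P_0}(B))$]
$\bigcup R_0^*(\mathrm{parent}_{P_0}(B))$ = [as $P_0 = \mathrm{par}(\mu)$ and $R_0 = R_0^* = R_\mu$]
$\bigcup R_\mu(\mathrm{parent}_{\mathrm{par}(\mu)}(B))$ = [by Lemma 4.2 (ii), $\langle \mathrm{par}(\mu), R_\mu \rangle = \langle \mathrm{par}(\mu^{\mathrm{d}}), R_{\mu^{\mathrm{d}}} \rangle$]
$\bigcup R_{\mu^{\mathrm{d}}}(\mathrm{parent}_{\mathrm{par}(\mu^{\mathrm{d}})}(B))$ = [by definition of $R_{\mu^{\mathrm{d}}}$]
$\bigcup \{ C \in \mathrm{par}(\mu^{\mathrm{d}}) \mid C \subseteq \mu^{\mathrm{d}}(\mathrm{parent}_{\mathrm{par}(\mu^{\mathrm{d}})}(B)) \}$ = [by Lemma 2.4 (ii)]
$\mu^{\mathrm{d}}(\mathrm{parent}_{\mathrm{par}(\mu^{\mathrm{d}})}(B))$ = [by Lemma 2.4 (i)]
$\mu^{\mathrm{d}}(B)$.

Thus, since, by Lemma 4.2 (i), $P' \preceq \mathrm{par}(\mu')$, by Lemma 2.4 (iv), $P' \preceq P_\mu = \mathrm{par}(\mu^{\mathrm{d}})$ and both $\mu'$ and $\mu^{\mathrm{d}}$
are disjunctive, we have that for any $X \in \wp(\Sigma)$,

$\mu'(X)$ = [by Lemma 2.4 (iii)]
$\bigcup \{ \mu'(B) \mid B \in P',\ B \cap X \neq \varnothing \}$ $\subseteq$ [as $\mu'(B) \subseteq \mu^{\mathrm{d}}(B)$]
$\bigcup \{ \mu^{\mathrm{d}}(B) \mid B \in P',\ B \cap X \neq \varnothing \}$ = [by Lemma 2.4 (iii)]
$\mu^{\mathrm{d}}(X)$ $\subseteq$ [as $\mu^{\mathrm{d}} \sqsubseteq \mu$]
$\mu(X)$.

Thus, $\mu'$ is a refinement of $\mu$. We have that $P' \preceq \mathrm{par}(\mu')$, $R' = R_k$ is (as shown above) reflexive
and because $\langle P', R' \rangle$ is the output partition-relation pair, for all $B, C \in P'$, if $C \cap \mathrm{pre}(B) \neq \varnothing$ then
$\bigcup R'(C) \subseteq \mathrm{pre}(\bigcup R'(B))$. Hence, by Lemma 4.4 we obtain that $\mu'$ is forward complete for pre. Thus, $\mu'$
is a disjunctive refinement of $\mu$ that is forward complete for pre so that $\mu' \sqsubseteq \mathcal{S}_{\cup,\mathrm{pre}}(\mu)$.
In order to conclude the proof, let us show that $\mathcal{S}_{\cup,\mathrm{pre}}(\mu) \sqsubseteq \mu'$. We first show by induction that for any
$i \in [0, k]$ and $B \in P_i$, we have that $\bigcup R_i(B) \in \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$: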

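($i = 0$) We have that $\langle P_0, R_0 \rangle = \langle P_\mu, R_\mu \rangle$ so that for any $B \in P_0$, by Lemma 2.4 (ii), $\bigcup R_0(B) = \bigcup \{ C \in
\mathrm{par}(\mu) \mid C \subseteq \mu(B) \} = \mu(B)$. Hence, $\bigcup R_0(B) \in \mathrm{img}(\mu) \subseteq \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$.

($i + 1$) Let $C \in P_{i+1} = \mathit{Split}(P_i, \mathrm{pre}(\bigcup R_i(B_i)))$ for some $B_i \in P_i$. If $C \cap \mathrm{pre}(B_i) = \varnothing$ then, by lines 6-7,
$\bigcup R_{i+1}(C) = \bigcup R_i(\mathrm{parent}_{P_i}(C))$ so that, by inductive hypothesis, $\bigcup R_{i+1}(C) \in \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$.
On the other hand, if $C \cap \mathrm{pre}(B_i) \neq \varnothing$ then, by lines 6-7, $\bigcup R_{i+1}(C) = \bigcup R_i(\mathrm{parent}_{P_i}(C)) \cap
\mathrm{pre}(\bigcup R_i(B_i))$. By inductive hypothesis, we have that $\bigcup R_i(\mathrm{parent}_{P_i}(C)) \in \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$ and
$\bigcup R_i(B_i) \in \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$. Also, since $\mathcal{S}_{\cup,\mathrm{pre}}(\mu)$ is forward complete for pre, $\mathrm{pre}(\bigcup R_i(B_i)) \in
\mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$. Hence, $\bigcup R_{i+1}(C) \in \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$.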
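As observed above, $R_k$ is reflexive so that for any $B \in P_k$, $B \subseteq \bigcup R_k(B)$. For any $B \in P'$, we have that

$\mathcal{S}_{\cup,\mathrm{pre}}(\mu)(B)$ $\subseteq$ [as $B \subseteq \bigcup R_k(B)$]
$\mathcal{S}_{\cup,\mathrm{pre}}(\mu)(\bigcup R_k(B))$ = [as $\bigcup R_k(B) \in \mathrm{img}(\mathcal{S}_{\cup,\mathrm{pre}}(\mu))$]
$\bigcup R_k(B)$ $\subseteq$ [as $R_k \subseteq R_k^*$]
$\bigcup R_k^*(B)$ = [by definition]
$\mu'(B)$.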

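Therefore, for any $X \in \wp(\Sigma)$,

$\mathcal{S}_{\cup,\mathrm{pre}}(\mu)(X)$ $\subseteq$ [as $X \subseteq \bigcup \{ B \in P' \mid B \cap X \neq \varnothing \}$]
$\mathcal{S}_{\cup,\mathrm{pre}}(\mu)(\bigcup \{ B \in P' \mid B \cap X \neq \varnothing \})$ = [as $\mathcal{S}_{\cup,\mathrm{pre}}(\mu)$ is additive]
$\bigcup \{ \mathcal{S}_{\cup,\mathrm{pre}}(\mu)(B) \mid B \in P',\ B \cap X \neq \varnothing \}$ $\subseteq$ [as $\mathcal{S}_{\cup,\mathrm{pre}}(\mu)(B) \subseteq \mu'(B)$]
$\bigcup \{ \mu'(B) \mid B \in P',\ B \cap X \neq \varnothing \}$ = [as $\mu'$ is disjunctive, by Lemma 2.4 (iii)]
$\mu'(X)$.

We have therefore shown that $\mathcal{S}_{\cup,\mathrm{pre}}(\mu) \sqsubseteq \mu'$. □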

Thus, BasicSA computes the forward $\{\cup, \mathrm{pre}\}$-complete shell of any input abstract domain. As a
consequence, BasicSA allows us to compute both the simulation relation and simulation equivalence when $\mu_\ell$ is the
initial abstract domain.

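Corollary 6.2. Let $\mathcal{K} = (\Sigma, \rightarrow, \ell)$ be a finite Kripke structure and $\mu_\ell \in \mathrm{uco}(\wp(\Sigma))$ be the abstract domain
induced by $\ell$. Then, $\mathrm{BasicSA}(\mu_\ell) = \langle P', R' \rangle$ where $P' = P_{\mathrm{sim}}$ and, for any $s_1, s_2 \in \Sigma$, $(s_1, s_2) \in
R_{\mathrm{sim}} \Leftrightarrow (P_{\mathrm{sim}}(s_1), P_{\mathrm{sim}}(s_2)) \in R'$.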

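Proof. Let $\mu_{\mathcal{K}} = \mathcal{S}_{\cup,\mathrm{pre}}(\mu_\ell)$. By Theorem 6.1, if $\mathrm{BasicSA}(\mu_\ell) = \langle P', R' \rangle$ then $\mu_{\langle P',R' \rangle} = \mu_{\mathcal{K}}$. By
Theorem 3.1, $\mathrm{par}(\mu_{\mathcal{K}}) = P_{\mathrm{sim}}$. By Lemma 4.2 (i), $P' \preceq \mathrm{par}(\mu_{\langle P',R' \rangle}) = \mathrm{par}(\mu_{\mathcal{K}}) = P_{\mathrm{sim}}$. It remains
to show that $P_{\mathrm{sim}} = \mathrm{par}(\mu_{\langle P',R' \rangle}) \preceq P'$. Let $\{\langle P_i, R_i \rangle\}_{i \in [0,k]}$ be the sequence of partition-relation pairs
computed by BasicSA, where $\langle P_0, R_0 \rangle = \langle P_{\mu_\ell}, R_{\mu_\ell} \rangle$ and $\langle P_k, R_k \rangle = \langle P', R' \rangle$. We show by induction
that for any $i \in [0, k]$, we have that $\mathrm{par}(\mu_{\langle P',R' \rangle}) \preceq P_i$.

($i = 0$) Since $\mu_{\langle P',R' \rangle} \sqsubseteq \mu_\ell$, we have that $\mathrm{par}(\mu_{\langle P',R' \rangle}) \preceq \mathrm{par}(\mu_\ell) = P_0$.

($i + 1$) Consider $B \in \mathrm{par}(\mu_{\langle P',R' \rangle})$. We have that $P_{i+1} = \mathit{Split}(P_i, \mathrm{pre}(\bigcup R_i(B_i)))$ for some $B_i \in P_i$.
We have shown in the proof of Theorem 6.1 that $\bigcup R_i(B_i) \in \mu_{\mathcal{K}} = \mu_{\langle P',R' \rangle}$. Since $\mu_{\langle P',R' \rangle}$ is forward
complete for pre, we also have that $\mathrm{pre}(\bigcup R_i(B_i)) \in \mu_{\langle P',R' \rangle}$. Hence, $B \cap \mathrm{pre}(\bigcup R_i(B_i)) \in
\{\varnothing, B\}$. By inductive hypothesis, $\mathrm{par}(\mu_{\langle P',R' \rangle}) \preceq P_i$ so that there exists some $C \in P_i$ such that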






1 RefinedSA(PartitionRelation ⟨P, Rel⟩) {
2   forall B ∈ P do prePrevRel(B) := Σ;
3   while ∃B ∈ P such that pre(∪Rel(B)) ≠ prePrevRel(B) do
4     // Inv1: ∀B ∈ P. pre(∪Rel(B)) ⊆ prePrevRel(B)
5     // Inv2: ∀B, C ∈ P. C ∩ pre(B) ≠ ∅ ⇒ ∪Rel(C) ⊆ prePrevRel(B)
6     Remove := prePrevRel(B) \ pre(∪Rel(B));
7     prePrevRel(B) := pre(∪Rel(B));
8     P_prev := P; B_prev := B;
9     P := Split(P, prePrevRel(B));
10    forall C ∈ P do
11      Rel(C) := {D ∈ P | D ⊆ ∪Rel(parent_{P_prev}(C))};
12      if C ∈ P \ P_prev then prePrevRel(C) := prePrevRel(parent_{P_prev}(C));
13    forall C ∈ P such that C ∩ pre(B_prev) ≠ ∅ do
14      Rel(C) := {D ∈ Rel(C) | D ∩ Remove = ∅};
15 }

Figure 3: Refined Simulation Algorithm.



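$B \subseteq C$. Since $P_{i+1} = \mathit{Split}(P_i, \mathrm{pre}(\bigcup R_i(B_i)))$, note that if $C \cap \mathrm{pre}(\bigcup R_i(B_i)) \neq \varnothing$ then
$C \cap \mathrm{pre}(\bigcup R_i(B_i)) \in P_{i+1}$ and if $C \setminus \mathrm{pre}(\bigcup R_i(B_i)) \neq \varnothing$ then $C \setminus \mathrm{pre}(\bigcup R_i(B_i)) \in
P_{i+1}$. Moreover, if $B \cap \mathrm{pre}(\bigcup R_i(B_i)) = \varnothing$ then $B \subseteq C \setminus \mathrm{pre}(\bigcup R_i(B_i))$, while if $B \cap
\mathrm{pre}(\bigcup R_i(B_i)) = B$ then $B \subseteq C \cap \mathrm{pre}(\bigcup R_i(B_i))$. In both cases, there exists some $D \in P_{i+1}$
such that $B \subseteq D$.

Thus, $P' = P_{\mathrm{sim}}$.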

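The proof of Theorem 6.1 shows that $R'$ is reflexive. Moreover, that proof also shows that for any $B \in P'$,
$\bigcup R'(B) \in \mu_{\mathcal{K}}$. Then, for any $B \in P'$:

$\bigcup R'^*(B)$ = [by definition of $\mu_{\langle P',R' \rangle}$]
$\mu_{\langle P',R' \rangle}(B)$ $\subseteq$ [because $R'$ is reflexive]
$\mu_{\langle P',R' \rangle}(\bigcup R'(B))$ = [because $\mu_{\langle P',R' \rangle} = \mu_{\mathcal{K}}$]
$\mu_{\mathcal{K}}(\bigcup R'(B))$ = [because $\bigcup R'(B) \in \mu_{\mathcal{K}}$]
$\bigcup R'(B)$

and therefore $R'$ is transitive. Hence, for any $s_1, s_2 \in \Sigma$,

$(s_1, s_2) \in R_{\mathrm{sim}}$ $\Leftrightarrow$ [by Theorem 3.1]
$s_2 \in \mu_{\mathcal{K}}(\{s_1\})$ $\Leftrightarrow$ [because $\mu_{\mathcal{K}} = \mu_{\langle P',R' \rangle}$]
$s_2 \in \mu_{\langle P',R' \rangle}(\{s_1\})$ $\Leftrightarrow$ [by definition of $\mu_{\langle P',R' \rangle}$]
$(P'(s_1), P'(s_2)) \in R'^*$ $\Leftrightarrow$ [because $P' = P_{\mathrm{sim}}$ and $R'^* = R'$]
$(P_{\mathrm{sim}}(s_1), P_{\mathrm{sim}}(s_2)) \in R'$. □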



6.2 Refining the Algorithm 

The BasicSA algorithm is refined to the RefinedSA procedure in Figure 3. This is obtained by adapting
the ideas of Henzinger et al.'s RefinedSimilarity procedure in Figure 1 to our BasicSA algorithm. The
following points show that this algorithm RefinedSA remains correct, i.e. the input-output behaviours of
BasicSA and RefinedSA are the same.

- For any block $B$ of the current partition $P$, the predecessors of the blocks in the "previous" relation
$\mathit{Rel}_{\mathrm{prev}}(B)$ are maintained as a set $\mathit{prePrevRel}(B)$. Initially, at line 2, $\mathit{prePrevRel}(B)$ is set to
contain all the states in $\Sigma$. Then, when a block $B$ is selected by the while-loop at some iteration $i$,
$\mathit{prePrevRel}(B)$ is updated at line 7 in order to save the states in $\mathrm{pre}(\bigcup \mathit{Rel}(B))$ at this iteration $i$.

- If $C$ is a newly generated block after splitting $P$ and $\mathrm{parent}_{P_{\mathrm{prev}}}(C)$ is its corresponding parent block
in the partition before splitting then $\mathit{prePrevRel}(C)$ is set at line 12 as $\mathit{prePrevRel}(\mathrm{parent}_{P_{\mathrm{prev}}}(C))$.
Therefore, since the current relation $\mathit{Rel}$ decreases only (i.e., if $i$ and $j$ are iterations such that $j$
follows $i$ and $B$, $B'$ are blocks such that $B' \subseteq B$ then $\bigcup \mathit{Rel}_j(B') \subseteq \bigcup \mathit{Rel}_i(B)$), at each iteration
the following invariant Inv1 holds: for any block $B \in P$, $\mathrm{pre}(\bigcup \mathit{Rel}(B)) \subseteq \mathit{prePrevRel}(B)$.
Initially, Inv1 is satisfied because for any block $B$, $\mathit{prePrevRel}(B)$ is initialized to $\Sigma$ at line 2.

- The crucial point is the invariant Inv2: if $C \rightarrow^{\exists\exists} B$ and $D \in \mathit{Rel}(C)$ then $D \subseteq \mathit{prePrevRel}(B)$.
Initially, this invariant property is clearly satisfied because for any block $B$, $\mathit{prePrevRel}(B)$ is initialized
to $\Sigma$. Moreover, Inv2 is maintained at each iteration because at line 6 $\mathit{Remove}$ is set to
$\mathit{prePrevRel}(B) \setminus \mathrm{pre}(\bigcup \mathit{Rel}(B))$ and for any block $C$ such that $C \rightarrow^{\exists\exists} B_{\mathrm{prev}}$ if some block $D$ is
contained in $\mathit{Remove}$ then $D$ is removed from $\mathit{Rel}(C)$ at line 14.

1 SA(PartitionRelation ⟨P, Rel⟩) {
2   // forall B ∈ P do prePrevRel(B) := Σ;
3   forall B ∈ P do Remove(B) := Σ \ pre(∪Rel(B));
4   while ∃B ∈ P such that Remove(B) ≠ ∅ do
5     // Inv3: ∀C ∈ P. Remove(C) = prePrevRel(C) \ pre(∪Rel(C))
6     // Inv4: ∀C ∈ P. Split(P, prePrevRel(C)) = P
7     // prePrevRel(B) := pre(∪Rel(B));
8     Remove := Remove(B);
9     Remove(B) := ∅;
10    B_prev := B;
11    P_prev := P;
12    P := Split(P, Remove);
13    forall C ∈ P do
14      Rel(C) := {D ∈ P | D ⊆ ∪Rel(parent_{P_prev}(C))};
15      if C ∈ P \ P_prev then
16        Remove(C) := Remove(parent_{P_prev}(C));
17        // prePrevRel(C) := prePrevRel(parent_{P_prev}(C));
18    RemoveList := {D ∈ P | D ⊆ Remove};
19    forall C ∈ P such that C ∩ pre(B_prev) ≠ ∅ do
20      forall D ∈ RemoveList do
21        if D ∈ Rel(C) then
22          Rel(C) := Rel(C) \ {D};
23          forall s ∈ pre(D) such that s ∉ pre(∪Rel(C)) do
24            Remove(C) := Remove(C) ∪ {s};
25 }

Figure 4: The Simulation Algorithm SA.

Thus, if the exit condition of the while-loop of RefinedSA is satisfied then, by invariant Inv2, the exit
condition of BasicSA is satisfied as well.

Finally, let us remark that the exit condition of the while-loop, namely $\forall B \in P.\ \mathrm{pre}(\bigcup \mathit{Rel}(B)) =
\mathit{prePrevRel}(B)$, is strictly weaker than the exit condition that we would obtain as counterpart of the exit
condition of the while-loop of Henzinger et al.'s RefinedSimilarity procedure, i.e. $\forall B \in P.\ \mathit{Rel}(B) =
\mathit{Rel}_{\mathrm{prev}}(B)$.

6.3 The Final Algorithm

Following the underlying ideas that lead from RefinedSimilarity to HHK, the algorithm RefinedSA is
further refined to its final version SA in Figure 4. The idea is that instead of recomputing at each iteration
of the while-loop the set $\mathit{Remove} = \mathit{prePrevRel}(B) \setminus \mathrm{pre}(\bigcup \mathit{Rel}(B))$ for the selected block $B$, we
maintain a set of states $\mathit{Remove}(B) \subseteq \Sigma$ for each block $B$ of the current partition. For any block $C$,
$\mathit{Remove}(C)$ is updated in order to satisfy the invariant condition Inv3: $\mathit{Remove}(C)$ contains exactly the
set of states that belong to $\mathit{prePrevRel}(C)$ but are not in $\mathrm{pre}(\bigcup \mathit{Rel}(C))$, where $\mathit{prePrevRel}(C)$ is logically
defined as in RefinedSA but is not really stored. Moreover, the invariant condition Inv4 ensures that, for
any block $C$, $\mathit{prePrevRel}(C)$ is a union of blocks of the current partition. This allows us to replace the
operation $\mathit{Split}(P, \mathrm{pre}(\bigcup \mathit{Rel}(B)))$ in RefinedSA with the equivalent split operation $\mathit{Split}(P, \mathit{Remove})$.
The correctness of such replacement follows from the invariant condition Inv4 by exploiting the following
general remark.

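Lemma 6.3. Let $P$ be a partition, $T$ be a union of blocks in $P$ and $S \subseteq T$. Then, $\mathit{Split}(P, S) =
\mathit{Split}(P, T \setminus S)$.

Proof. Consider any block $B \in P$. Assume that $B \cap T = \varnothing$, so that $B \cap S = \varnothing$. Then,

$B \cap (T \setminus S) = B \cap (T \cap \neg S) = \varnothing = B \cap S$

and

$B \setminus (T \setminus S) = (B \cap \neg T) \cup (B \cap S) = B = B \setminus S$

so that $B$ is split neither by $T \setminus S$ nor by $S$.

Otherwise, if $B \cap T \neq \varnothing$, because $T$ is a union of blocks, then $B \subseteq T$. Then,

$B \cap (T \setminus S) = B \cap (T \cap \neg S) = B \cap \neg S = B \setminus S$

and

$B \setminus (T \setminus S) = (B \cap \neg T) \cup (B \cap S) = B \cap S$

so that $B$ is split by $T \setminus S$ into $B_1$ and $B_2$ if and only if $B$ is split by $S$ into $B_1$ and $B_2$. We have thus
shown that $\mathit{Split}(P, S) = \mathit{Split}(P, T \setminus S)$. □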
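Lemma 6.3 is easy to check on a concrete instance. The Python sketch below uses a hypothetical `split` helper that models $\mathit{Split}(P, S)$ by cutting every block into its part inside $S$ and its part outside $S$; the partition and the sets `T` and `S` are invented for the illustration.

```python
def split(partition, s):
    """Split(P, S): replace each block B by the nonempty sets among B ∩ S and B \\ S."""
    s = set(s)
    out = set()
    for b in partition:
        for piece in (b & s, b - s):
            if piece:
                out.add(frozenset(piece))
    return out


P = {frozenset({1, 2}), frozenset({3, 4}), frozenset({5})}
T = {1, 2, 3, 4}   # a union of two blocks of P
S = {1, 3}         # a subset of T
by_s = split(P, S)
by_complement = split(P, T - S)
```

Both calls produce the partition into singletons, as the lemma predicts for any $S \subseteq T$ when $T$ is a union of blocks.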

The equivalence between SA and RefinedSA is a consequence of the following observations.

- Initially, the invariant properties Inv3 and Inv4 clearly hold because for any block $B$, $\mathit{prePrevRel}(B) =
\Sigma$.

- When a block $B_{\mathrm{prev}}$ of the current partition is selected by the while-loop, the corresponding remove
set $\mathit{Remove}(B_{\mathrm{prev}})$ is set to empty at line 9. The invariant Inv3, namely $\forall C.\ \mathit{Remove}(C) =
\mathit{prePrevRel}(C) \setminus \mathrm{pre}(\bigcup \mathit{Rel}(C))$, is maintained at each iteration because for any block $C$ such that
$C \rightarrow^{\exists\exists} B_{\mathrm{prev}}$ the for-loop at lines 23-24 incrementally adds to $\mathit{Remove}(C)$ all the states $s$ that are in
$\mathit{prePrevRel}(C)$ but not in $\mathrm{pre}(\bigcup \mathit{Rel}(C))$.

- If $C$ is a newly generated block after splitting $P$ and $\mathrm{parent}_{P_{\mathrm{prev}}}(C)$ is its corresponding parent
block in the partition before splitting then $\mathit{Remove}(C)$ is set to $\mathit{Remove}(\mathrm{parent}_{P_{\mathrm{prev}}}(C))$ by the
for-loop at lines 13-17.

- As in RefinedSA, for any block $C$ such that $C \rightarrow^{\exists\exists} B_{\mathrm{prev}}$, all the blocks that are contained in
$\mathit{Remove}(B_{\mathrm{prev}})$ are removed from $\mathit{Rel}(C)$ by the for-loop at lines 20-22.

If the exit condition of the while-loop of SA is satisfied then, by Inv1 and Inv3, the exit condition of
RefinedSA is satisfied as well.
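The final procedure can be sketched in Python along the lines of Figure 4. This is a plain set-based rendering intended only to show the control flow: it does not use the data structures of Section 7 and therefore does not attain the stated complexity. The name `sa`, the assumed initial pair (blocks of equally labelled states with the identity relation) and the test structure are hypothetical.

```python
def sa(states, trans, label):
    """Set-based sketch of SA: one remove set per block instead of prePrevRel."""
    everything = set(states)

    def pre(xs):
        return {a for (a, b) in trans if b in xs}

    def union(blocks):
        return set().union(*blocks)

    groups = {}
    for s in states:
        groups.setdefault(label[s], set()).add(s)
    part = {frozenset(g) for g in groups.values()}
    rel = {b: {b} for b in part}                      # assumed initial relation
    remove = {b: everything - pre(union(rel[b])) for b in part}
    while True:
        # deterministic choice of a block with a nonempty remove set
        b_prev = min((b for b in part if remove[b]), key=sorted, default=None)
        if b_prev is None:
            return part, rel
        rem, remove[b_prev] = remove[b_prev], set()   # line 9, before splitting
        parent = {}
        for blk in part:                              # line 12: Split(P, Remove)
            for piece in (blk & rem, blk - rem):
                if piece:
                    parent[frozenset(piece)] = blk
        new_part = set(parent)
        new_rel = {c: {d for d in new_part if d <= union(rel[parent[c]])}
                   for c in new_part}
        new_remove = {c: set(remove[parent[c]]) for c in new_part}
        remove_list = [d for d in new_part if d <= rem]
        pre_b = pre(b_prev)
        for c in new_part:
            if c & pre_b:
                for d in remove_list:
                    if d in new_rel[c]:
                        new_rel[c].discard(d)
                        for s in pre(d):
                            if s not in pre(union(new_rel[c])):
                                new_remove[c].add(s)
        part, rel, remove = new_part, new_rel, new_remove


# Hypothetical structure (not from the paper).
states = [1, 2, 3]
trans = [(1, 1), (1, 3), (2, 2)]
label = {1: "p", 2: "p", 3: "q"}
part, rel = sa(states, trans, label)
```

In an efficient implementation the membership test in the innermost loop would be replaced by a counter lookup, in the spirit of the Count matrix of HHK.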
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Figure 5: Partition representation. 



7 Complexity 
7.1 Data Structures 

SA is implemented by using the following data structures. 

(i) The set of states $\Sigma$ is represented as a doubly linked list where each state $s \in \Sigma$ (represented as an
integer) stores the list of its predecessors in $\mathrm{pre}(\{s\})$. This provides a representation of the input
transition system. Any state $s \in \Sigma$ also stores a pointer to the block of the current partition that
contains $s$.

(ii) The states of any block $B$ of the current partition are consecutive in the list $\Sigma$, so that $B$ is represented
by a record that contains two pointers to the first and to the last state in $B$ (see Figure 5). This
structure allows us to move a state from a block to a different block in constant time. Moreover, any
block $B$ stores its corresponding remove set $B.\mathit{Remove}$, which is represented as a list of (pointers
to) states.

(iii) Any block $B$ additionally stores an integer array $\mathit{RelCount}$ that is indexed over $\Sigma$ and is defined as
follows: for any $x \in \Sigma$, $B.\mathit{RelCount}(x) = \sum_{C \in \mathit{Rel}(B)} |\{ (x, y) \mid x \rightarrow y,\ y \in C \}|$ is the number of
transitions from $x$ to some block $C \in \mathit{Rel}(B)$. The array $\mathit{RelCount}$ allows us to implement in constant
time the test $s \notin \mathrm{pre}(\bigcup \mathit{Rel}(C))$ at line 23 as $C.\mathit{RelCount}(s) = 0$.

(iv) The current partition is stored as a doubly linked list P of blocks. Newly generated blocks are 
appended or prepended to this list. Blocks are scanned from the beginning of this list by checking 
whether the corresponding remove set is empty or not. If an empty remove set of some block B 
becomes nonempty then B is moved to the end of P. 

(v) The current relation Rel on the current partition $P$ is stored as a resizable $|P| \times |P|$ boolean matrix
(see Section 17.4 of the cited reference). The algorithm adds a new entry to this matrix, namely a new row and a new
column, whenever a block $B$ is split at line 12 into two new blocks $B \smallsetminus \mathrm{Remove}$ and $B \cap \mathrm{Remove}$:
the new block $B \smallsetminus \mathrm{Remove}$ replaces the old block $B$ in $P$ while a new entry in the matrix Rel
corresponds to the new block $B \cap \mathrm{Remove}$. We will observe later that the overall number of newly
generated blocks by the splitting operation at line 12 is exactly $2(|P_{\mathrm{sim}}| - |P_{\mathrm{in}}|)$. Hence,
the total number of insert operations in the matrix Rel is $|P_{\mathrm{sim}}| - |P_{\mathrm{in}}| \leq |P_{\mathrm{sim}}|$. Since an insert
operation in a resizable array (whose capacity is doubled as needed) takes amortized constant
time, the overall cost of inserting new entries into the matrix Rel is in $O(|P_{\mathrm{sim}}|^2)$-time. Let us recall
that the standard C++ vector class implements a resizable array so that a resizable boolean matrix
can be easily implemented as a C++ vector of boolean vectors: in this implementation, the algorithm
adds a new entry to an $N \times N$ matrix by first inserting a new vector of $N + 1$ false values (the new row)
and then appending one false value to each of the $N$ pre-existing boolean vectors (the new column).
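
The resizable matrix of point (v) can be sketched as follows. This is only a minimal illustration under stated assumptions: the class name `BoolMatrix` and its members are hypothetical and are not taken from the authors' implementation; only the idea of a `std::vector` of boolean vectors comes from the text.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the paper's code): a square boolean matrix that
// grows by one row and one column at a time, as described in point (v).
class BoolMatrix {
public:
    explicit BoolMatrix(std::size_t n = 0)
        : rows_(n, std::vector<bool>(n, false)) {}

    std::size_t size() const { return rows_.size(); }

    // Adds one column and one row of false values; returns the new index.
    std::size_t addNewEntry() {
        const std::size_t n = rows_.size();
        for (auto& row : rows_) row.push_back(false);  // new column
        rows_.emplace_back(n + 1, false);              // new row of size n + 1
        return n;
    }

    bool get(std::size_t i, std::size_t j) const { return rows_[i][j]; }
    void set(std::size_t i, std::size_t j, bool v) { rows_[i][j] = v; }

private:
    std::vector<std::vector<bool>> rows_;
};
```

Each call to `addNewEntry` touches every existing row once, so its cost is linear in the current number of blocks, which matches the $O(|P_{\mathrm{sim}}|)$ amortized cost per new entry used in the analysis.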



7.2 Space and Time Complexity 

Let $B \in P_{\mathrm{in}}$ be some block of the initial partition $P_{\mathrm{in}}$ and let $(B_i)_{i \in It}$ be the sequence of all the blocks
selected by the while-loop in a sequence $It$ of iterations such that:

(a) for any $i \in It$, $B_i \subseteq B$;
(b) if an iteration $j \in It$ follows an iteration $i \in It$, denoted by $i < j$, then $B_j$ is contained in $B_i$.

Observe that $B$ is the parent block in $P_{\mathrm{in}}$ of all the $B_i$'s. Then, one key property of the SA algorithm is
that the remove sets in $\{\mathrm{Remove}(B_i)\}_{i \in It}$ are pairwise disjoint so that $\sum_{i \in It} |\mathrm{Remove}(B_i)| \leq |\Sigma|$. This
property guarantees that if the test $D \in \mathrm{RemoveList}$ at line 20 is positive at some iteration $i \in It$ then
for any block $D' \subseteq D$ and for any successive iteration $j > i$, with $j \in It$, the test $D' \in \mathrm{RemoveList}$
will be negative. Moreover, if the test $D \in \mathrm{Rel}(C)$ at line 21 is positive at some iteration $i \in It$, so that
$D$ is removed from $\mathrm{Rel}(C)$, then for all the blocks $D'$ and $C'$ such that $D' \subseteq D$ and $C' \subseteq C$ the test
$D' \in \mathrm{Rel}(C')$ will be negative for all the iterations $j > i$. As a further consequence, since a splitting
operation $\mathrm{Split}(P, \mathrm{Remove})$ can be executed in $O(|\mathrm{Remove}|)$-time, it turns out that the overall cost of all
the splitting operations is in $O(|P_{\mathrm{sim}}||\Sigma|)$-time. Furthermore, by using the data structures described by
points (iii) and (v) in Section 7.1, the tests $D \in \mathrm{Rel}(C)$ at line 21 and $s \notin \mathrm{pre}(\cup \mathrm{Rel}(C))$ at line 23 can be
executed in constant time. A careful analysis that exploits these key facts allows us to show that the total
running time of SA is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$.
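
The constant-time test $s \notin \mathrm{pre}(\cup \mathrm{Rel}(C))$ relies on the counters of point (iii) in Section 7.1. The following sketch shows the idea for a single block on a simplified model. All names (`buildRelCount`, `removeBlock`, `post`, `pre`, `blockOf`, `inRel`) are hypothetical and chosen for illustration; states are assumed to be the integers 0..n-1.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified model (not the paper's code): post[x] lists the
// successors of state x, blockOf[y] is the block index of state y, and
// inRel[c] tells whether block c currently belongs to Rel(B) for a fixed B.
std::vector<int> buildRelCount(const std::vector<std::vector<int>>& post,
                               const std::vector<int>& blockOf,
                               const std::vector<bool>& inRel) {
    std::vector<int> relCount(post.size(), 0);
    for (std::size_t x = 0; x < post.size(); ++x)
        for (int y : post[x])
            if (inRel[blockOf[y]]) ++relCount[x];  // x -> y enters Rel(B)
    return relCount;
}

// Removes a block D (given by its states) from Rel(B). The caller must only
// pass a block that was in Rel(B). Every transition x -> y with y in D stops
// being counted; the returned states are those whose counter just reached 0,
// i.e. the states that no longer belong to pre(U Rel(B)).
std::vector<int> removeBlock(std::vector<int>& relCount,
                             const std::vector<std::vector<int>>& pre,
                             const std::vector<int>& statesOfD) {
    std::vector<int> newlyUnreachable;
    for (int y : statesOfD)
        for (int x : pre[y])
            if (--relCount[x] == 0) newlyUnreachable.push_back(x);
    return newlyUnreachable;
}
```

With such counters the membership test reduces to checking `relCount[s] == 0`, and the work of `removeBlock` is proportional to the number of transitions entering the removed block.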

Theorem 7.1. The algorithm SA runs in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$-time and $O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$-space.

Proof. Let $It$ denote the sequence of iterations of the while-loop for some run of SA, where for any
$i, j \in It$, $i < j$ means that $j$ follows $i$. Moreover, for any $i \in It$, $B_i$ denotes the block selected by the
while-loop at line 4, $\mathrm{Remove}(B_i) \neq \varnothing$ denotes the corresponding nonempty remove set, $\mathrm{pre}(\cup \mathrm{Rel}(B_i))$
denotes the corresponding set for $B_i$, while $\langle P_i, \mathrm{Rel}_i \rangle$ denotes the partition-relation pair at the entry point
of the for-loop at line 19.

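Consider the set $\mathfrak{B} = \{B_i \in P_i \mid i \in It\}$ of selected blocks and the following relation on $\mathfrak{B}$:

\[ B_i \trianglelefteq B_j \iff B_i \subsetneq B_j \ \text{ or } \ (B_i = B_j \ \&\ i \geq j) \]

It turns out that $\langle \mathfrak{B}, \trianglelefteq \rangle$ is a poset. In fact, $\trianglelefteq$ is trivially reflexive. Also, $\trianglelefteq$ is transitive: assume that $B_i \trianglelefteq B_j$
and $B_j \trianglelefteq B_k$; if $B_i = B_j = B_k$ then $i \geq j \geq k$ so that $B_i \trianglelefteq B_k$; otherwise either $B_i \subsetneq B_j$ or $B_j \subsetneq B_k$
so that $B_i \subsetneq B_k$ and therefore $B_i \trianglelefteq B_k$. Finally, $\trianglelefteq$ is antisymmetric: if $B_i \trianglelefteq B_j$ and $B_j \trianglelefteq B_i$ then
$B_i = B_j$ and $i \geq j \geq i$ so that $i = j$. Moreover, $B_i \triangleleft B_j$ denotes the corresponding strict order: this
happens when either $B_i \subsetneq B_j$, or $B_i = B_j$ and $i > j$.
The time complexity bound is shown incrementally by the following points.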

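(A) For any $B_i, B_j \in \mathfrak{B}$, if $B_i \subseteq B_j$ and $j < i$ then $\mathrm{Remove}(B_i) \cap \mathrm{Remove}(B_j) = \varnothing$.

Proof. By invariant $\mathrm{Inv}_3$, $\mathrm{Remove}(B_j) \cap \mathrm{pre}(\cup \mathrm{Rel}_j(B_j)) = \varnothing$. At iteration $j$, $\mathrm{Remove}(B_j)$ is set
to $\varnothing$ at line 9. If $B_j$ generates, by the splitting operation at line 12, two new blocks $B_1, B_2 \subseteq B_j$ then
their remove sets are set to $\varnothing$ at line 16. Successively, SA may add at line 24 of some iteration $k \geq j$
a state $s$ to the remove set $\mathrm{Remove}(C)$ of a block $C \subseteq B_j$ only if $s \in \mathrm{pre}(\cup \mathrm{Rel}_k(C))$. We also
have that $\cup \mathrm{Rel}_k(C) \subseteq \cup \mathrm{Rel}_j(B_j)$ so that $\mathrm{pre}(\cup \mathrm{Rel}_k(C)) \subseteq \mathrm{pre}(\cup \mathrm{Rel}_j(B_j))$. Thus, if $B_i \subseteq B_j$
and $i > j$ then $\mathrm{Remove}(B_i) \subseteq \mathrm{pre}(\cup \mathrm{Rel}_j(B_j))$. Therefore, $\mathrm{Remove}(B_j) \cap \mathrm{Remove}(B_i) \subseteq
\mathrm{Remove}(B_j) \cap \mathrm{pre}(\cup \mathrm{Rel}_j(B_j)) = \varnothing$.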

(B) The overall number of newly generated blocks by the splitting operation at line 12 is $2(|P_{\mathrm{sim}}| - |P_{\mathrm{in}}|)$.

Proof. Let $\{P_i\}_{i \in [0, n]}$ be the sequence of partitions computed by SA where $P_0$ is the initial partition
$P_{\mathrm{in}}$, $P_n$ is the final partition $P_{\mathrm{sim}}$ and for all $i \in [0, n - 1]$, $P_{i+1} \prec P_i$. The number of newly
generated blocks by one splitting operation that refines $P_i$ to $P_{i+1}$ is given by $2(|P_{i+1}| - |P_i|)$.
Thus, the overall number of newly generated blocks is $\sum_{i=0}^{n-1} 2(|P_{i+1}| - |P_i|) = 2(|P_{\mathrm{sim}}| - |P_{\mathrm{in}}|)$.
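
As a worked instance of point (B), the first display below plugs in the partition sizes reported for the model vasy_0_1 in Table 1. The second display uses a hypothetical chain of intermediate partition sizes (3, 5, 12, 21), chosen only to illustrate how the sum telescopes; the actual intermediate sizes are not reported in the paper.

```latex
% Sizes read off Table 1, row vasy_0_1.
\[
  |P_{\mathrm{in}}| = 3, \qquad |P_{\mathrm{sim}}| = 21
  \quad\Longrightarrow\quad
  2\,(|P_{\mathrm{sim}}| - |P_{\mathrm{in}}|) = 2\,(21 - 3) = 36 .
\]
% Telescoping on a hypothetical chain of partition sizes 3, 5, 12, 21.
\[
  2(5 - 3) + 2(12 - 5) + 2(21 - 12) = 4 + 14 + 18 = 36 .
\]
```

Only the first and the last size matter, which is why the bound depends on $|P_{\mathrm{sim}}|$ and $|P_{\mathrm{in}}|$ alone.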

(C) The time complexity of the for-loop at line 3 is in $O(|P_{\mathrm{in}}||{\rightarrow}|)$.

Proof. For any $B \in P_{\mathrm{in}}$, $\mathrm{pre}(\cup \mathrm{Rel}(B))$ is computed in $O(|{\rightarrow}|)$-time, so that $\Sigma \smallsetminus \mathrm{pre}(\cup \mathrm{Rel}(B))$
is computed in $O(|{\rightarrow}|)$-time as well. The time complexity of the initialization of the remove sets is
therefore in $O(|P_{\mathrm{in}}||{\rightarrow}|)$.

(D) The overall time complexity of lines 8 and 18 is in $O(|P_{\mathrm{sim}}||\Sigma|)$.

Proof. Note that at line 18, Remove is a union of blocks of the current partition $P$. As described
in Section 7.1 (i), each state $s$ also stores a pointer to the block of the current partition that contains

ListOfBlocks Split(PartitionRelation P, SetOfStates S) {
  ListOfBlocks split = empty;
  forall s in S do {
    Block B = s.block;
    if (B.intersection == NULL) then {
      B.intersection = new Block;
      if (B.Remove == ∅) then P.prepend(B.intersection);
      else P.append(B.intersection);
      split.append(B);
    }
    move s from B to B.intersection;
    if (B == empty) then {
      B = copy(B.intersection);
      P.remove(B.intersection);
      delete B.intersection;
      split.remove(B);
    }
  }
  return split;
}

SplittingProcedure(P, S) {
  /* P_prev := P; */
  ListOfBlocks split = Split(P, S);
  /* assert(split == {B ∖ S ∈ P | B ∖ S ∉ P_prev}) */
  forall B in split do {
    Rel.addNewEntry(B.intersection);
    B.intersection.Remove = copy(B.Remove);
  }
  forall B in P do
    forall C in split do Rel(B, C.intersection) = Rel(B, C);
  forall B in split do {
    forall C in P do Rel(B.intersection, C) = Rel(B, C);
    forall x in Σ do B.intersection.RelCount(x) = B.RelCount(x);
  }
}

Figure 6: C++ Pseudocode Implementation of the Splitting Procedure.

$s$. The list of blocks RemoveList is therefore computed by scanning all the states in $\mathrm{Remove}(B_i)$,
where $B_i$ is the selected block at iteration $i$, so that the overall time complexity of lines 8 and 18
is bounded by $2 \sum_{i \in It} |\mathrm{Remove}(B_i)|$. For any block $E \in P_{\mathrm{sim}}$ of the final partition we define the
following subset of iterations:

\[ It_E = \{i \in It \mid E \subseteq B_i\}. \]

Since for any $i \in It$, $P_{\mathrm{sim}} \preceq P_i$, we have that for any $i \in It$ there exists some $E \in P_{\mathrm{sim}}$ such
that $i \in It_E$. Note that if $i, j \in It_E$ and $i < j$ then $B_j \subseteq B_i$ and, by point (A), this implies that
$\mathrm{Remove}(B_i) \cap \mathrm{Remove}(B_j) = \varnothing$. Thus,

\[
\begin{aligned}
2 \sum_{i \in It} |\mathrm{Remove}(B_i)| &\leq && \text{[by definition of } It_E\text{]} \\
2 \sum_{E \in P_{\mathrm{sim}}} \sum_{i \in It_E} |\mathrm{Remove}(B_i)| &\leq && \text{[as the sets in } \{\mathrm{Remove}(B_i)\}_{i \in It_E} \text{ are pairwise disjoint]} \\
2 |P_{\mathrm{sim}}||\Sigma|. &
\end{aligned}
\]

(E) The overall time complexity of line 10, i.e. of copying the list of states of the selected block $B$, is in
$O(|P_{\mathrm{sim}}||\Sigma|)$.

Proof. For any block $E \in P_{\mathrm{sim}}$ of the final partition we define the following subset of iterations:

\[ It_E = \{i \in It \mid E \subseteq \mathrm{Remove}(B_i)\}. \]

Since for any $i \in It$, $P_{\mathrm{sim}} \preceq P_i$ and $\mathrm{Remove}(B_i)$ is a union of blocks of $P_i$, it turns out that for
any $i \in It$ there exists some $E \in P_{\mathrm{sim}}$ such that $i \in It_E$. Note that if $i, j \in It_E$ and $i \neq j$ then
$B_j \cap B_i = \varnothing$: this is a consequence of point (A) because $E \subseteq \mathrm{Remove}(B_i) \cap \mathrm{Remove}(B_j) \neq \varnothing$
implies that $B_j \not\subseteq B_i$ and $B_i \not\subseteq B_j$ so that $B_i \cap B_j = \varnothing$. Thus,

\[
\begin{aligned}
\sum_{i \in It} |B_i| &\leq && \text{[by definition of } It_E\text{]} \\
\sum_{E \in P_{\mathrm{sim}}} \sum_{i \in It_E} |B_i| &\leq && \text{[as the blocks in } \{B_i\}_{i \in It_E} \text{ are pairwise disjoint]} \\
|P_{\mathrm{sim}}||\Sigma|. &
\end{aligned}
\]

(F) The overall time complexity of lines 11-17 is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$.

Proof. Figure 6 describes a C++ pseudocode implementation of lines 11-17. By using the data
structures described in Section 7.1, and in particular in Figure 5, all the operations of the procedure
Split take constant time so that any call $\mathrm{Split}(P, S)$ takes $O(|S|)$ time. Let us now consider
SplittingProcedure.

- The overall time complexity of the splitting operation at line 24 is in $O(|P_{\mathrm{sim}}||\Sigma|)$. Each
call $\mathrm{Split}(P, \mathrm{Remove}(B_i))$ takes $O(|\mathrm{Remove}(B_i)|)$ time. Then, analogously to the proof
of point (D), the overall time complexity of line 24 is bounded by $\sum_{i \in It} |\mathrm{Remove}(B_i)| \leq |P_{\mathrm{sim}}||\Sigma|$.

- The overall time complexity of the for-loop at lines 26-29 is in $O(|P_{\mathrm{sim}}||\Sigma|)$. It is only worth
noticing that since the boolean matrix that stores Rel is resizable, each operation at line 27
that adds a new entry to this resizable matrix has an amortized cost in $O(|P_{\mathrm{sim}}|)$: in fact, the
resizable matrix is just a resizable array $A$ of resizable arrays so that when we add a new entry
we need to add a new entry to $A$ and then a new entry to each array in $A$ (cf. point (v) in
Section 7.1). Thus, the overall time complexity of line 27 is in $O(|P_{\mathrm{sim}}|^2)$.

- The overall time complexity of the for-loop at lines 30-31 is in $O(|P_{\mathrm{sim}}|^2)$.

- The overall time complexity of the for-loop at lines 32-35 is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$. This is a
consequence of the fact that the overall time complexity of the for-loops at lines 33 and 34 is in
$O(|P_{\mathrm{sim}}||{\rightarrow}|)$.

Thus, the overall time complexity of $\mathrm{SplittingProcedure}(P, \mathrm{Remove})$ is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$.
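
The claim that a call $\mathrm{Split}(P, S)$ costs $O(|S|)$ can be illustrated on an array-based partition. The sketch below is a two-pass variant and is not the procedure of Figure 6: it first counts how many states of $S$ fall in each block and then moves states only out of blocks that are genuinely split, which avoids the undo step of Figure 6. The type `Partition`, its fields and the constants `kUnset` and `kKeep` are hypothetical names; `S` is assumed to contain no duplicates.

```cpp
#include <cstddef>
#include <vector>

constexpr int kUnset = -1;  // block not examined yet in the current call
constexpr int kKeep = -2;   // block entirely contained in S: not split

// Hypothetical array-based partition (not the paper's linked-list layout):
// blocks[b] lists the states of block b, blockOf[s] is the block of state s
// and pos[s] is the index of s inside blocks[blockOf[s]].
struct Partition {
    std::vector<std::vector<int>> blocks;
    std::vector<int> blockOf, pos;
    std::vector<int> hit, target;  // per-block scratch data, reset per call

    Partition(const std::vector<std::vector<int>>& bs, int numStates)
        : blocks(bs), blockOf(numStates), pos(numStates),
          hit(bs.size(), 0), target(bs.size(), kUnset) {
        for (std::size_t b = 0; b < blocks.size(); ++b)
            for (std::size_t k = 0; k < blocks[b].size(); ++k) {
                blockOf[blocks[b][k]] = static_cast<int>(b);
                pos[blocks[b][k]] = static_cast<int>(k);
            }
    }

    // Moves state s to block `to` in constant time (swap with last, pop).
    void move(int s, int to) {
        std::vector<int>& from = blocks[blockOf[s]];
        const int last = from.back();
        from[pos[s]] = last;
        pos[last] = pos[s];
        from.pop_back();
        pos[s] = static_cast<int>(blocks[to].size());
        blocks[to].push_back(s);
        blockOf[s] = to;
    }

    // Splits every block B such that both B ∩ S and B \ S are nonempty and
    // returns the indices of the blocks that were split.
    std::vector<int> split(const std::vector<int>& S) {
        std::vector<int> touched, splitBlocks;
        for (int s : S)  // pass 1: count |B ∩ S| for every touched block
            if (hit[blockOf[s]]++ == 0) touched.push_back(blockOf[s]);
        for (int s : S) {  // pass 2: move the states of split blocks
            const int b = blockOf[s];
            if (target[b] == kUnset) {
                if (hit[b] == static_cast<int>(blocks[b].size())) {
                    target[b] = kKeep;
                } else {
                    target[b] = static_cast<int>(blocks.size());
                    blocks.emplace_back();  // fresh block for B ∩ S
                    hit.push_back(0);
                    target.push_back(kUnset);
                    splitBlocks.push_back(b);
                }
            }
            if (target[b] >= 0) move(s, target[b]);
        }
        for (int b : touched) { hit[b] = 0; target[b] = kUnset; }
        return splitBlocks;
    }
};
```

Both passes and the final reset do work proportional to $|S|$, up to the amortized cost of growing the vectors, so the sketch preserves the linear bound that the proof of point (F) relies on.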

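(G) The overall time complexity of lines 19-21 is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$.

Proof. For any $B_i \in \mathfrak{B}$, let $\mathrm{arr}(B_i) = \sum_{x \in B_i} |\mathrm{pre}(\{x\})|$ denote the number of transitions that end
in some state of $B_i$ and $\mathrm{rem}(B_i) = |\{D \in P_i \mid D \subseteq \mathrm{Remove}(B_i)\}|$ denote the number of blocks of
$P_i$ contained in $\mathrm{Remove}(B_i)$. We also define two functions $f_{\triangleleft}, f_{\trianglelefteq} : \mathfrak{B} \rightarrow \wp(P_{\mathrm{sim}})$ as follows:

\[
\begin{aligned}
f_{\triangleleft}(B_i) &= \{D \in P_{\mathrm{sim}} \mid D \cap (\cup\{\mathrm{Remove}(B_j) \mid B_j \in \mathfrak{B},\ B_i \triangleleft B_j\}) = \varnothing\} \\
f_{\trianglelefteq}(B_i) &= \{D \in P_{\mathrm{sim}} \mid D \cap (\cup\{\mathrm{Remove}(B_j) \mid B_j \in \mathfrak{B},\ B_i \trianglelefteq B_j\}) = \varnothing\}
\end{aligned}
\]

Let us show the following property:

\[ \forall B_i \in \mathfrak{B}.\ \ \mathrm{rem}(B_i) + |f_{\trianglelefteq}(B_i)| \leq |f_{\triangleleft}(B_i)|. \tag{$\ddagger$} \]

We first observe that since $P_{\mathrm{sim}} \preceq P_i$, $\mathrm{rem}(B_i) \leq |\{D \in P_{\mathrm{sim}} \mid D \subseteq \mathrm{Remove}(B_i)\}|$. Moreover,
the sets $\{D \in P_{\mathrm{sim}} \mid D \subseteq \mathrm{Remove}(B_i)\}$ and $f_{\trianglelefteq}(B_i)$ are disjoint and their union gives $f_{\triangleleft}(B_i)$.
Hence,

\[
\begin{aligned}
\mathrm{rem}(B_i) + |f_{\trianglelefteq}(B_i)| &\leq \\
|\{D \in P_{\mathrm{sim}} \mid D \subseteq \mathrm{Remove}(B_i)\}| + |f_{\trianglelefteq}(B_i)| &= \\
|\{D \in P_{\mathrm{sim}} \mid D \subseteq \mathrm{Remove}(B_i)\} \cup f_{\trianglelefteq}(B_i)| &= \\
|f_{\triangleleft}(B_i)|. &
\end{aligned}
\]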
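Given $B_k \in \mathfrak{B}$, let us show by induction on the height $h(B_k) \geq 0$ of $B_k$ in the poset $\langle \mathfrak{B}, \trianglelefteq \rangle$ that

\[ \sum_{B_i \trianglelefteq B_k} \mathrm{arr}(B_i)\, \mathrm{rem}(B_i) \leq \mathrm{arr}(B_k)\, |f_{\triangleleft}(B_k)|. \tag{$*$} \]

($h(B_k) = 0$): By property ($\ddagger$), $\mathrm{rem}(B_k) \leq |f_{\triangleleft}(B_k)|$ so that

\[ \sum_{B_i \trianglelefteq B_k} \mathrm{arr}(B_i)\, \mathrm{rem}(B_i) = \mathrm{arr}(B_k)\, \mathrm{rem}(B_k) \leq \mathrm{arr}(B_k)\, |f_{\triangleleft}(B_k)|. \]

($h(B_k) > 0$): Let $\max(\{B_i \in \mathfrak{B} \mid B_i \triangleleft B_k\}) = \{C_1, \ldots, C_n\}$. Note that if $i \neq j$ then $C_i \cap C_j = \varnothing$,
so that $\sum_i \mathrm{arr}(C_i) \leq \mathrm{arr}(B_k)$, since $\cup_i C_i \subseteq B_k$. Let us observe that for any maximal $C_i$, $f_{\triangleleft}(C_i) \subseteq
f_{\trianglelefteq}(B_k)$ because $\cup\{\mathrm{Remove}(B_j) \mid B_j \in \mathfrak{B},\ B_k \trianglelefteq B_j\} \subseteq \cup\{\mathrm{Remove}(B_j) \mid B_j \in \mathfrak{B},\ C_i \triangleleft B_j\}$
since $B_k \trianglelefteq B_j$ and $C_i \triangleleft B_k$ imply $C_i \triangleleft B_j$.

Hence, we have that

\[
\begin{aligned}
\sum_{B_i \trianglelefteq B_k} \mathrm{arr}(B_i)\, \mathrm{rem}(B_i) &= && \text{[by maximality of the } C_i\text{'s]} \\
\mathrm{arr}(B_k)\, \mathrm{rem}(B_k) + \sum_{C_i} \sum_{B \trianglelefteq C_i} \mathrm{arr}(B)\, \mathrm{rem}(B) &\leq && \text{[by inductive hypothesis on } h(C_i) < h(B_k)\text{]} \\
\mathrm{arr}(B_k)\, \mathrm{rem}(B_k) + \sum_{C_i} \mathrm{arr}(C_i)\, |f_{\triangleleft}(C_i)| &\leq && \text{[as } f_{\triangleleft}(C_i) \subseteq f_{\trianglelefteq}(B_k)\text{]} \\
\mathrm{arr}(B_k)\, \mathrm{rem}(B_k) + |f_{\trianglelefteq}(B_k)| \sum_{C_i} \mathrm{arr}(C_i) &\leq && \text{[as } \textstyle\sum_{C_i} \mathrm{arr}(C_i) \leq \mathrm{arr}(B_k)\text{]} \\
\mathrm{arr}(B_k)\, \mathrm{rem}(B_k) + |f_{\trianglelefteq}(B_k)|\, \mathrm{arr}(B_k) &= \\
\mathrm{arr}(B_k)\, (\mathrm{rem}(B_k) + |f_{\trianglelefteq}(B_k)|) &\leq && \text{[by } (\ddagger)\text{, } \mathrm{rem}(B_k) + |f_{\trianglelefteq}(B_k)| \leq |f_{\triangleleft}(B_k)|\text{]} \\
\mathrm{arr}(B_k)\, |f_{\triangleleft}(B_k)|. &
\end{aligned}
\]

Let us now show that the global time-complexity of lines 19-21 is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$. Let $\max(\mathfrak{B}) =
\{M_1, \ldots, M_k\}$ be the maximal elements in $\mathfrak{B}$ so that for any $i \neq j$, $M_i \cap M_j = \varnothing$, and in turn we
have that $\sum_{M_i \in \max(\mathfrak{B})} \mathrm{arr}(M_i) \leq |{\rightarrow}|$. By using the data structures described in Section 7.1, the
test $D \in \mathrm{Rel}(C)$ at line 21 takes constant time. Then, the overall complexity of lines 19-21 is

\[
\begin{aligned}
\sum_{B_i \in \mathfrak{B}} \mathrm{arr}(B_i)\, \mathrm{rem}(B_i) &= && \text{[as the } M_i\text{'s are maximal in } \mathfrak{B}\text{]} \\
\sum_{M_i \in \max(\mathfrak{B})} \sum_{B \trianglelefteq M_i} \mathrm{arr}(B)\, \mathrm{rem}(B) &\leq && \text{[by property } (*) \text{ above and } |f_{\triangleleft}(M_i)| \leq |P_{\mathrm{sim}}|\text{]} \\
\sum_{M_i \in \max(\mathfrak{B})} \mathrm{arr}(M_i)\, |P_{\mathrm{sim}}| &= \\
|P_{\mathrm{sim}}| \sum_{M_i \in \max(\mathfrak{B})} \mathrm{arr}(M_i) &\leq && \text{[as } \textstyle\sum_{M_i \in \max(\mathfrak{B})} \mathrm{arr}(M_i) \leq |{\rightarrow}|\text{]} \\
|P_{\mathrm{sim}}||{\rightarrow}|. &
\end{aligned}
\]

(H) The overall time complexity of lines 22-24 is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$.

Proof. Let $\mathcal{P}$ denote the multiset of pairs of blocks $(C, D)$ of the current partition that are scanned at lines 19-20 at
some iteration $i \in It$ such that $D \in \mathrm{Rel}_i(C)$. By using the data structures described in Section 7.1,
the test $s \notin \mathrm{pre}(\cup \mathrm{Rel}(C))$ and the statement $\mathrm{Rel}(C) := \mathrm{Rel}(C) \smallsetminus \{D\}$ take constant time. Moreover,
the statement $\mathrm{Remove}(C) := \mathrm{Remove}(C) \cup \{s\}$ also takes constant time because if a state $s$
is added to $\mathrm{Remove}(C)$ at line 24 then $s$ was not already in $\mathrm{Remove}(C)$ so that this operation can
be implemented simply by appending $s$ to the list of states that represents $\mathrm{Remove}(C)$. Therefore,
the overall time complexity of the body of the if-then statement at lines 21-24 is $\sum_{(C, D) \in \mathcal{P}} \mathrm{arr}(D)$.
We notice the following fact. Let $i, j \in It$ such that $i < j$ and let $(C, D_i)$ and $(C, D_j)$ be pairs
of blocks scanned at lines 19-20, respectively, at iterations $i$ and $j$ such that $D_j \subseteq D_i$. Then,
if the test $D_i \in \mathrm{Rel}_i(C)$ is true at iteration $i$ then the test $D_j \in \mathrm{Rel}_j(C)$ is false at iteration $j$.
This is a consequence of the fact that if $D_i \in \mathrm{Rel}_i(C)$ then $D_i$ is removed from $\mathrm{Rel}_i(C)$ at line 22
and $\cup \mathrm{Rel}_j(C) \subseteq \cup \mathrm{Rel}_i(C)$ so that $D_i \cap \cup \mathrm{Rel}_j(C) = \varnothing$. Hence, if $(C, D), (C, D') \in \mathcal{P}$ then
$D \cap D' = \varnothing$. We define the set $\mathcal{C} = \{C \mid \exists D.\ (C, D) \in \mathcal{P}\}$ and given $C \in \mathcal{C}$, the multiset
$\mathcal{D}_C = \{D \mid (C, D) \in \mathcal{P}\}$. Observe that $|\mathcal{C}|$ is bounded by the number of blocks that appear in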

Initialize(PartitionRelation P) {
  forall B in P do {
    B.Remove = pre(Σ) ∖ pre(∪{C in P | Rel(B,C)});
    forall x in Σ do B.RelCount(x) = 0;
  }
  forall B in P do
    forall y in B do
      forall x in pre({y}) do
        forall C in P such that Rel(C,B) do C.RelCount(x)++;
}

SA(PartitionRelation P) {
  Initialize(P);
  forall B in P such that (B.Remove ≠ ∅) do {
    Set Remove = B.Remove;
    B.Remove = ∅;
    Set B_prev = B;
    SplittingProcedure(P, Remove);
    ListOfBlocks RemoveList = {D ∈ P | D ⊆ Remove};
    forall C in P such that (C ∩ pre(B_prev) ≠ ∅) do
      forall D in RemoveList do
        if (Rel(C,D)) then {
          Rel(C,D) = 0;
          forall d in D do
            forall x in pre(d) do {
              C.RelCount(x)--;
              if (C.RelCount(x) == 0) then {
                C.Remove = C.Remove ∪ {x};
                P.moveAtTheEnd(C);
              }
            }
        }
  }
}

Figure 7: C++ Pseudocode Implementation of SA.



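some partition $P_i$, so that by point (B), $|\mathcal{C}| \leq 2(|P_{\mathrm{sim}}| - |P_{\mathrm{in}}|) + |P_{\mathrm{in}}| \leq 2|P_{\mathrm{sim}}|$. Moreover, the
observation above implies that $\mathcal{D}_C$ is indeed a set and the blocks in $\mathcal{D}_C$ are pairwise disjoint. Thus,

\[
\begin{aligned}
\sum_{C \in \mathcal{C}} \sum_{D \in \mathcal{D}_C} \mathrm{arr}(D) &\leq && \text{[as the blocks in } \mathcal{D}_C \text{ are pairwise disjoint]} \\
\sum_{C \in \mathcal{C}} |{\rightarrow}| &\leq && \text{[as } |\mathcal{C}| \leq 2|P_{\mathrm{sim}}|\text{]} \\
2|P_{\mathrm{sim}}||{\rightarrow}|. &
\end{aligned}
\]

Summing up, we have shown that the overall time-complexity of SA is in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$.

The space complexity is in $O(|\Sigma| \log |P_{\mathrm{sim}}| + |P_{\mathrm{sim}}| + |P_{\mathrm{sim}}|^2 + |P_{\mathrm{sim}}||\Sigma| \log |\Sigma|) = O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$,
where:

- The pointers from any state $s \in \Sigma$ to the block of the current partition that contains $s$ are stored in
$O(|\Sigma| \log |P_{\mathrm{sim}}|)$ space.

- The current partition $P$ is stored in $O(|P_{\mathrm{sim}}|)$ space.

- The current relation Rel is stored in $O(|P_{\mathrm{sim}}|^2)$ space.

- Each block of the current partition stores the corresponding remove set in $O(|\Sigma|)$ space and the
integer array RelCount in $O(|\Sigma| \log |\Sigma|)$ space, so that these globally take $O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$ space. □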



8 Experimental Evaluation 

A pseudocode implementation of the algorithm SA that shows how the data structures in Section 7.1 are
actually used is in Figure 7, where SplittingProcedure has been introduced above in Figure 6. We
implemented in C++ both our simulation algorithm SA and the HHK algorithm in order to experimentally
compare the time and space performances of SA and HHK. In order to make the comparison as meaningful
as possible, these two C++ implementations use the same data structures for storing transition systems,
sets of states and tables.

Our benchmarks include systems from the VLTS (Very Large Transition Systems) benchmark suite [30]
and some publicly available Esterel programs. These models are represented as labeled transition systems
(LTSs) where labels are attached to transitions. Since the versions of SA and HHK considered in this paper
both need as input a Kripke structure, namely a transition system where labels are attached to states, we
exploited a procedure by Dovier et al. [16] that transforms an LTS $M$ into a Kripke structure $M'$ in such
a way that bisimulation and simulation equivalences on $M$ and $M'$ coincide. This transformation acts as
follows: any labeled transition $s_1 \xrightarrow{l} s_2$ is replaced by two unlabeled transitions $s_1 \rightarrow n$ and $n \rightarrow s_2$,
where $n$ is a new node that is labeled with $l$, while all the original states in $M$ have the same label. This
labeling provides an initial partition on $M'$ which is denoted by $P_{\mathrm{in}}$. Hence, this transformation grows the
size of the model as follows: the number of transitions is doubled and the number of states of $M'$ is the
sum of the number of states and transitions of $M$. Also, the models cwi_3_14, vasy_5_9, vasy_25_25 and
vasy_8_38 have non-total transition relations. The vasy_* and cwi_* systems are taken from the VLTS suite,
while the remaining systems are the following Esterel programs: WristWatch and ShockDance are taken
from the programming examples of Esterel [17], ObsArbitrer4 and AtLeastOneAck4 are described in the
technical report [3], and lift, NoAckWithoutReq and one_pump are provided together with the fc2symbmin tool
that is used by Xeve, a graphical verification environment for Esterel programs [4], [31].
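
The LTS-to-Kripke transformation described above can be sketched in a few lines. This is an illustration of the textual description only, not the procedure of Dovier et al. as implemented by the authors; the types `LabeledTransition` and `Kripke`, the function `ltsToKripke` and the parameter `commonLabel` are hypothetical names, and the common label is assumed to differ from every transition label.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; states are the integers 0..n-1.
struct LabeledTransition {
    int src;
    std::string label;
    int dst;
};

struct Kripke {
    std::vector<std::string> stateLabel;           // one label per state
    std::vector<std::pair<int, int>> transitions;  // unlabeled edges
};

// Replaces each labeled transition s1 -l-> s2 by s1 -> n and n -> s2, where
// n is a fresh state labeled l. All original states get the same label,
// which is assumed to be distinct from every transition label.
Kripke ltsToKripke(int numStates, const std::vector<LabeledTransition>& lts,
                   const std::string& commonLabel) {
    Kripke k;
    k.stateLabel.assign(static_cast<std::size_t>(numStates), commonLabel);
    k.transitions.reserve(2 * lts.size());
    for (const LabeledTransition& t : lts) {
        const int n = static_cast<int>(k.stateLabel.size());  // fresh node
        k.stateLabel.push_back(t.label);
        k.transitions.emplace_back(t.src, n);
        k.transitions.emplace_back(n, t.dst);
    }
    return k;
}
```

Grouping the states of the result by `stateLabel` gives the initial partition: one block for the original states and one block per distinct transition label. The sizes grow exactly as stated in the text, with twice as many transitions and as many states as the original states plus transitions.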

Our experimental evaluation was carried out on an Intel Core 2 Duo 1.86 GHz PC, with 2 GB RAM,
running Linux and GNU g++ 4. The results are summarised in Table 1, where we list the name of the
transition system, the number of states and transitions of the transformed transition system, the number
of blocks of the initial partition, the number of blocks of the final simulation equivalence partition (that is
known when one algorithm terminates), the execution time in seconds and the allocated memory in MB
(this has been obtained by means of glibc-memusage) both for HHK and SA, where o.o.m. means that the
algorithm ran out of memory (2 GB).

The comparative experimental evaluation shows that SA outperforms HHK both in time and in space.
In fact, the experiments demonstrate that SA improves on HHK by about two orders of magnitude in time
and by one order of magnitude in space. The sums of the time and space measures on the eight models where
both HHK and SA terminate are 64.555 vs. 1.39 seconds in time and 681.303 vs. 52.102 MB in space. Our
experiments considered 18 models: HHK terminates on 8 models while SA terminates on 14 of these 18
models. Also, the size of the models (states plus transitions) on which SA terminates is about one
order of magnitude larger than for HHK.

9 Conclusion 

We presented a new efficient algorithm for computing the simulation preorder in $O(|P_{\mathrm{sim}}||{\rightarrow}|)$-time and
$O(|P_{\mathrm{sim}}||\Sigma| \log |\Sigma|)$-space, where $P_{\mathrm{sim}}$ is the partition induced by simulation equivalence on some Kripke
structure $(\Sigma, \rightarrow)$. This improves the best available time bound $O(|\Sigma||{\rightarrow}|)$ given by Henzinger, Henzinger
and Kopke's [23] and by Bloom and Paige's [2] simulation algorithms, which however suffer from a space
complexity that is bounded from below by $\Omega(|\Sigma|^2)$. A better space bound is given by Gentilini et al.'s [18]
algorithm (subsequently corrected by van Glabbeek and Ploeger [21]), whose space complexity is in
$O(|P_{\mathrm{sim}}|^2 + |\Sigma| \log |P_{\mathrm{sim}}|)$, but which runs in $O(|P_{\mathrm{sim}}|^2 |{\rightarrow}|)$-time. Our algorithm is designed as an adaptation
of Henzinger et al.'s procedure and abstract interpretation techniques are used for proving its correctness.

As future work, we plan to investigate whether the techniques used for designing this new simulation
algorithm may be generalized and adapted to other behavioural equivalences like branching simulation
equivalence (a weakening of branching bisimulation equivalence [13]). It is also interesting to investigate
whether this new algorithm may admit a symbolic version based on BDDs.

Acknowledgements. The authors are grateful to the anonymous referees for their detailed and helpful
comments and to Silvia Crafa for many useful discussions. This work was partially supported by the FIRB
Project "Abstract interpretation and model checking for the verification of embedded systems", by the
PRIN 2007 Project "AIDA2007: Abstract Interpretation Design and Applications" and by the University of

                   Input                                    HHK                   SA
Model            |Σ|      |→|      |P_in|   |P_sim|   Time (s)  Space (MB)  Time (s)  Space (MB)
cwi_1_2          4339     4774     27       2401      22.761    191         0.76      41
cwi_3_14         18548    29104    3        123       -         o.o.m.      0.96      9
vasy_0_1         1513     2448     3        21        1.303     27          0.03      0.229
vasy_10_56       67005    112312   13       ??        -         o.o.m.      -         o.o.m.
vasy_1_4         5647     8928     7        87        37.14     407         0.28      2
vasy_18_73       91789    146086   18       ??        -         o.o.m.      -         o.o.m.
vasy_25_25       50433    50432    25217    ??        -         o.o.m.      -         o.o.m.
vasy_40_60       100013   120014   4        ??        -         o.o.m.      -         o.o.m.
vasy_5_9         15162    19352    32       409       -         o.o.m.      1.63      24
vasy_8_24        33290    48822    12       1423      -         o.o.m.      5.95      182
vasy_8_38        47345    76848    82       963       -         o.o.m.      8.15      176
WristWatch       1453     1685     23       1146      1.425     31          0.15      6
ShockDance       379      459      10       327       0.75      2           0.03      0.547
ObsArbitrer4     17389    21394    10       159       -         o.o.m.      0.3       11
AtLeastOneAck4   435      507      18       112       0.363     2           0.02      0.219
lift             138      163      33       112       0.11      0.303       0.02      0.107
NoAckWithoutReq  1212     1372     18       413       0.703     21          0.1       2
one_pump         15774    17926    22       3193      -         o.o.m.      13.64     194

Table 1: Results of the experimental evaluation.



Padova under the Project "Formal methods for specifying and verifying behavioural properties of software
systems". This paper is an extended and revised version of [28].